
Audit My PC - Free Internet Security Audit

Firewall Test and web tools to check your security and privacy

Sitemap Generator

Our free sitemap generator not only allows you to build an XML sitemap for Google, Bing and other search engines, but also includes tools that help discover problems that may be preventing your site from ranking well in search results. Best of all, it’s completely free, with no limits and nothing to download!

** IMPORTANT ** The SITEMAP GENERATOR respects sessions! If you are logged into your site and have delete privileges, the sitemap generator will follow all links, including ‘delete links’, so play it safe and make sure you are NOT logged into your website!

Current Version of the XML Sitemap Generator is v2.14. The main focus of this update was the indexing speed of very large websites, which is now 250% faster than in the previous version. IF YOU HAVE PROBLEMS running my tool, see my section on using Firefox ESR.

How to Make a Sitemap

For those that want to skip the instructions: XML Sitemap.

At one time, each search engine had its own idea of how a sitemap should be formatted. Fortunately, a SITEMAPS standard was developed for XML sitemaps that Google, Bing, Yahoo and other SE’s now adhere to.

There is still the traditional HTML sitemap, and we have taken that into consideration when building the sitemap generator; our webmaster tool provides you with the option to generate an XML sitemap, HTML sitemap, raw list of urls, session report and an HTML report – how you save (export) the file is up to you.

Before the XML Sitemap, website owners used an HTML sitemap to get their content recognized by search engines, and it still works extremely well! The nice thing about this type of sitemap is that it can help visitors navigate your site while allowing search engines to find your content.

GENERATE A SITEMAP FAST

It’s simple: just enter the website you would like to generate a site map for (found under the ‘settings’ tab) and click the little green arrow to start crawling.

The sitemap generator will spider your site using the default settings and give you the option to create an XML sitemap (or HTML sitemap, depending on your need).

Any errors, such as missing pages, duplicate titles or overly large files that may be slowing down your site will be listed for your review.

SITEMAP GENERATOR DETAILS (the manual)
This site map generator (now part of the webmaster tool) is loaded with tons of features and consists of six tabs:

‘Project’ tab – allows you to save and load your sitemap project. This can be very handy when making an XML Sitemap for a large website with thousands of pages. If you decide to use filters after you have crawled your site, you need to select “New Project” and run it again to obtain your new sitemap. Note: A sitemap project file is NOT the same as an XML Sitemap (which is found under the Sitemap Tab).

‘Settings’ tab – allows you to specify the way your site will be spidered and what will be included in your Google or XML sitemaps.

  • Project Name – This will be the name of your project
  • URL – The full address (including the http://) of the website you would like to create a Google or XML sitemap from
  • Filters (case sensitive) – you can tell our tool to include or exclude certain files or content when you generate a sitemap. Regular Expressions are supported when you prefix it with “complex:”.
    • Include / Exclude Filter – This is a list of path patterns; the asterisk (*) wildcard is supported and matching is case sensitive. When the sitemap generator is about to crawl a location, the location is validated against all inclusion patterns. If none match, the location will not be processed and will not be placed into the sitemap. If you leave the include area empty, it is assumed that you want to include everything, so all locations are processed unless an exclude filter removes them.
    • Include / Exclude Content Type Filter – same rules apply as with include/exclude filters and target the type of content.
    • Here are some examples of exclude filters I used for my WordPress Sitemap:
      */trackback/*
      */feed/*
      */feed
      */comments/*
      */tag/*
      */author/*
      */wp-content/*
      */wp-json/*
      *xmlrpc*
      *wp-admin*
      *.css
      *.xml
      *.zip
      *.swf
      *.jpg
      *.jpeg
      *.png
      *.pdf
      complex:.\?.

      Note: The last (complex) statement excludes any URL that contains a question mark, such as a WordPress shortlink like testingiam.com/?p=31.

  • Informal Links Regex – Allows you to search for links that are not standard, such as hidden comment spam that may link out to other sites.
    Ex: (?i)[a-zA-Z0-9\-\.]+\.(com|org|net|mil|edu)
  • Rules – Allows you to set rules when creating your XML, HTML or Google Sitemap.
    • Load From – Provides you with the option to process files from the entire server or only from below the initial directory specified by the URL parameter. For example, if the URL parameter is a server address, this option does not affect the behavior of the Google sitemap generator; however, if you enter a directory, say for example http://www.popupcheck.com/news/index.html, only files below the /news directory will be processed, including any subdirectories.
    • Respect Robots.txt file – you can tell the sitemap generator to honor this file or to ignore it.
    • Respect Meta Robots.txt – you can do as this meta tag instructs or ignore it.
    • Respect No Follow – If the sitemap builder finds a link with a no-follow tag, it will ignore or follow it depending on your selection.
    • Ignore invalid links – If you find links that try to back up past your root directory, then you can choose to not include this in your sitemaps.
    • Exclude images – Check this. Images will not be included anyway (not used)
    • Download only new files – this works when you have a sitemap project that you have saved out
    • Case Sensitive URLs – Treat URL’s that have different case as unique
    • Skip unmatched non-canonical links – If a page has a canonical url that differs from itself, it will not be included in the sitemap
  • Options – Some cool stuff here and can be very important!
    • Add skipped links to sitemap – when the webmaster tool crawls your site, it may find bad links. This option allows you to include those in the sitemap anyway.
    • User agent – When you spider a site, the name ‘AuditMyPC Webmaster Tool’ is left in the log files as the browser type. Some web hosting companies may block a browser if it is making too many requests. You can change the user agent to something else by selecting from the drop down or typing it in!
    • Max Level – This is not the file depth in a directory structure, but 1 + the number of links between this document and the root document (the project settings url). For example, if the settings url is set to testingiam.com, then that document’s level is 1. If it links to testingiam.com/level1/, that page is level 2, and if that page links to testingiam.com/level1/level2/level3/level4/level5/level6/level7/, that page is level 3.
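
The Max Level rule above can be sketched as a breadth-first walk over a link graph. This is only an illustration, assuming the level is counted along the shortest chain of links from the root; the `page_levels` function and the `links` dictionary are hypothetical and are not part of the tool.

```python
from collections import deque


def page_levels(links, root):
    """Return {url: level}, where level = 1 + fewest links from the root."""
    levels = {root: 1}          # the project settings url is level 1
    queue = deque([root])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in levels:            # first visit = shortest path
                levels[target] = levels[page] + 1
                queue.append(target)
    return levels


# Hypothetical link graph mirroring the example above.
deep_page = "testingiam.com/level1/level2/level3/level4/level5/level6/level7/"
links = {
    "testingiam.com": ["testingiam.com/level1/"],
    "testingiam.com/level1/": [deep_page],
}
levels = page_levels(links, "testingiam.com")
print(levels[deep_page])
```

The deeply nested page still ends up at level 3, because only the number of links followed matters, not the number of directories in the path.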

‘Crawler’ tab – Set the speed at which the sitemap is generated.

  • Request Delay – Our XML sitemap generator works extremely fast. The downside of this is that some internet service providers may find this places a heavy load on the server. If this is the case, then you can place delays between requests.
  • Connect Timeout – When building an XML sitemap, the crawler may encounter pages that don’t load or take too long; a timeout can be set for these.
  • Read Timeout – If the spider finds a page that goes on forever, you can specify a timeout for reading that page.
  • Transfer Rate – Each thread can transfer web pages at a very fast rate. You can tone that down a bit if necessary, but the default works fine.
  • Thread Count – The number of simultaneous crawling threads to run when creating the Google site map. This may significantly decrease overall crawling time if a large number of threads is specified, but it will increase bandwidth usage – so use with caution or just run with the default.
  • Autosave Interval – Tells the sitemap generator to save the project out every X number of minutes – default is don’t save. Change this if you have a very large site!
  • Once you click on the button, crawling will begin and you’ll be presented with status indicators for thread status, uri, values and more. All parameters are self explanatory and ‘Finished’ will appear once crawling is completed. You may stop the sitemap generation at any time by pressing the ‘cancel’ button.

‘Sitemap’ Tab – Contains a ton of information about each page, updates in real time and lists all the locations / files that have been crawled. Under the ‘Sitemap’ tab, you have sub-tabs, such as ‘save sitemap’, ‘retry’, ‘row filter’, ‘column filter’ and ‘trees’.

  • Export – this is where you decide what type of sitemap you would like to build.
    You can choose ‘Sitemap XML’ (for creating XML sitemaps used by Google, Bing, Yahoo and others), ‘URL Raw List’, ‘Delimited File’, ‘Session File’, ‘HTML’ (sitemap only – the old style sitemap, not an XML sitemap) and ‘HTML Report’.
  • Retry Failed – This option will retry to read pages from the sitemap that had problems on the last run
  • Row Filter – When building a sitemap (crawling a site), you can filter out rows based on just about anything you can think of (see the question mark next to each item for details). For example, Google released results of what they found when indexing the top sites; one of those metrics is the average size of a web page, which was 312KB, so you could enter 319488 (312 × 1024) as the length filter (needs to be in Bytes) and find all the pages that Google considers large – and fix them.
  • Column Filter – Same as the Row filter but for Columns.
  • Find – This allows you to search your xml sitemap for text.
  • You have the option to edit ‘Modified’, ‘Change frequency’ and ‘Priority’ cells for each row (or all rows – we’ll get to that in a moment).
  • Listing of URLs (pages) – You’ll see a listing of all your web pages that includes items like Title, Status, Errors and more.
    For the Google sitemap, you can set the ‘Change Frequency’ and ‘Priority’ for one or multiple urls by highlighting the desired url(s) and right clicking, then choosing your option. You can also delete pages from your Google or XML Sitemap by simply highlighting the desired urls and pressing your computer’s delete key.

    • Change frequency – Tells Google Sitemaps the frequency that content of a particular URL will change. Your options are “always”, “hourly”, “daily”, “weekly”, “monthly”, “yearly” or “never”. The value “always” should be used to describe documents that change each time they are accessed. The value “never” should be used to describe archived URLs.
    • Priority – The priority of a particular URL relative to other pages on your site. You may select between 0.0 and 1.0, where 0.0 identifies the lowest priority page(s) on your website and 1.0 identifies the highest priority page(s) on your website.
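
To make the two fields above concrete, here is a minimal sketch of how ‘Change frequency’ and ‘Priority’ end up in a sitemap file that follows the sitemaps.org format. It is not the tool’s own export code; the `build_sitemap` function and the example.com addresses are made up for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
CHANGE_FREQUENCIES = {"always", "hourly", "daily", "weekly",
                      "monthly", "yearly", "never"}


def build_sitemap(entries):
    """Build sitemap XML from (loc, changefreq, priority) tuples."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, changefreq, priority in entries:
        if changefreq not in CHANGE_FREQUENCIES:
            raise ValueError(f"unknown change frequency: {changefreq}")
        if not 0.0 <= priority <= 1.0:
            raise ValueError(f"priority out of range: {priority}")
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "changefreq").text = changefreq
        ET.SubElement(url, "priority").text = f"{priority:.1f}"
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages: a busy home page and an archived page.
sitemap_xml = build_sitemap([
    ("https://www.example.com/", "daily", 1.0),
    ("https://www.example.com/archive/2010/", "never", 0.1),
])
print(sitemap_xml)
```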

‘URL Check’ Tab – This is an important tool for finding out why a page does not load or why Bing, Google, Yahoo and other SE’s are not including it in their index. It’s a great way to check server headers and allows you to modify the request properties!

  • URL – Enter the full website address or url that you want more information about
  • Request Properties – Enter values to send to the server such as the user agent or referrer. To enter a user agent and referrer when validating a page, simply enter:
    User-Agent=X
    Referer=X

    Where X equals the value you want. Here is an example:

    Referer: https://www.auditmypc.com
    User-Agent: Mozilla/5.0 (compatible; ScoutJet; +http://www.scoutjet.com/)

    If you ran the URL check on your site with these settings, your log files would show that the request was made by the blekko bot (scoutjet) and the visitor was referred by the site auditmypc.com

    Note: You can use this section of the tool to test security, website behavior and more…

  • Save Content – You can also send the server headers and other information to a file or content view. When you click the ‘to content view’ option, then click start and then click the ‘Content’ Tab (next to ‘Request’), you’ll see the content / source of the web page. If you then click ‘Parse document info’, you’ll see a document tree and document info. The document tree is more advanced and can help webmasters discover missing head tags, body tags and more.

    The Document Info will show you the title, number of links, meta tags and a listing of all the links found on that page.
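
For readers who want to reproduce a URL check outside the tool, the sketch below builds the same kind of request with a custom Referer and User-Agent using Python’s standard library. It is a rough equivalent, not the tool’s implementation; the `build_request` helper and the example.com target are placeholders.

```python
import urllib.request


def build_request(url, properties):
    """Prepare a HEAD request that carries the given request properties."""
    return urllib.request.Request(url, headers=properties, method="HEAD")


request = build_request("https://www.example.com/", {
    "Referer": "https://www.auditmypc.com",
    "User-Agent": "Mozilla/5.0 (compatible; ScoutJet; +http://www.scoutjet.com/)",
})

# Sending it is left commented out so the sketch runs without a network:
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(response.status)
#     print(response.headers)
print(request.get_method(), request.full_url)
```

A HEAD request returns only the server headers, which is usually enough to see status codes and redirects; switch the method to GET if you also want the page content.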

‘System Information’ Tab – Shows you how much memory Sitemap Generator (Java) has available for use. If you are checking links or building an XML Sitemap for a large site, you’ll want to allocate more memory. Allocating more memory is as simple as issuing a command – see this 60 second video on how to increase Java memory for more information.

To increase the memory available to Java, simply add the parameter -XmxNNNm, where NNN is ½ of your total conventional memory in megabytes. On Windows, this is done through Control Panel -> Java -> Java tab -> Java Applet Runtime Settings -> View.

For example, say that you are running the link checker on a website containing 500,000 pages; simply type “-Xmx512m” in the “Java Runtime Parameters” field (provided you have a total memory of at least 1GB – on average, you can go up to half of your computer memory).
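
The half-of-memory rule above is simple arithmetic; here it is as a tiny sketch. The `xmx_flag` helper is hypothetical and only formats the parameter, it does not change any Java setting.

```python
def xmx_flag(total_memory_mb):
    """Return a Java -Xmx parameter set to half of the total memory (in MB)."""
    return f"-Xmx{total_memory_mb // 2}m"


# A machine with 1GB (1024MB) of memory gives the value used in the example.
print(xmx_flag(1024))
```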

Create the Sitemap – Export the Sitemap XML file.

Once the sitemap generator is finished crawling your website, you need to export the sitemap file for search engines. Simply select the SITEMAP tab, then EXPORT, then SITEMAP XML. If you go with the defaults, it will save a sitemap called “New sitemap_sitemap.xml” into your default folder (usually your “My Documents” folder). Once you have the sitemap file, simply upload it (ftp, transfer) to your website’s main folder and let the search engines know the location.

sitemap.xsl example

Note: if you add a stylesheet reference (it doesn’t matter to the bots, but it looks great and is easy to read), then you’ll need the following sitemap.xsl file (it’s zipped up) placed in the same location as your sitemap.xml file.

Google Sitemap Generator – How to Submit Sitemap to Google

  • When the sitemap generator has completed the crawling process, select Export under the Sitemaps Tab and choose Sitemap XML
  • Enter the filename you would like to save the sitemap as and click save (the default is sitemaps.xml which is fine)
  • Upload the new sitemap to your website. It seems every hosting company has a different method of doing this, but they are all basically the same – think of your sitemap.xml file as any htm (html, php or asp) file that you’re going to place on your website. There is probably some type of import option that your hosting company provides you – use it to move (FTP, Publish, etc) the sitemap.xml file from your computer onto your website. Place it in the same directory that holds the main page for your website.
  • Log into your Google Sitemaps account by visiting Google Sitemaps Account.
  • Click on the “Add a Sitemap” link.
  • Enter the URL for your Sitemap in the field, then click the [Submit URL] button.
  • Example: The URL for your Sitemap will be your website address, followed by the filename that you uploaded. For example, if I uploaded my ‘sitemap.xml’ file to my auditmypc.com site, the URL I would give to Google Sitemaps would be https://www.auditmypc.com/sitemap.xml

This will submit your Sitemap to the Google service. It may take Google a few hours to generate reports about your site, so be patient while they work their mojo.

Bing Sitemap Generator – How to Submit Sitemap to Bing

  • Create the sitemap as normal using our Sitemap Generator
  • Click the ‘Save Sitemap’ tab located under the ‘Sitemap’ tab
  • Select ‘Sitemap XML’ to save it out as the name you would like and then upload the sitemap to your website
  • Submit your XML File to Bing Webmaster Home

Yahoo Sitemap Generator – How to Submit Sitemap to Yahoo

  • Create the sitemap as normal using our Sitemap Generator
  • Click the ‘Save Sitemap’ tab located under the ‘Sitemap’ tab
  • Select ‘Sitemap XML’ to save it out as the name you would like and then upload the sitemap to your website
  • Submit your XML File to Yahoo Site Explorer – UPDATE: No Longer in service
      You can provide Yahoo Sitemaps with a feed in many formats other than XML (but stick with XML).

    • RSS 0.9, RSS 1.0 or RSS 2.0, for example, CNN Top Stories
    • Sitemaps, as documented on sitemaps.org
    • Atom 0.3, Atom 1.0, for example, Yahoo! Search Blog
    • A text file containing a list of URLs, each URL at the start of a new line. The filename of the URL list file must be urllist.txt; for a compressed file the name must be urllist.txt.gz.

XML Sitemap in Robots.txt File

Don’t want to have an account with each search engine to submit your sitemaps to? There is a solution! You can put the location of your sitemap inside a ROBOTS.TXT file. Every search engine will read your robots.txt file before crawling your site and if it sees this line:

Sitemap: [website address]/sitemap_location.xml

then it will find your sitemap without your having to do anything else.

Here is an example of a robots.txt file, which you can use if you don’t already have one:

User-agent: *
Disallow:
Sitemap: https://www.auditmypc.com/mynewsitemap.xml

Simply replace my location with your information.
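
If you want to confirm that a robots.txt file like the one above is read the way you expect, Python’s standard `urllib.robotparser` can parse it. This is only a sanity-check sketch (it needs Python 3.8 or newer for `site_maps()`); the page address passed to `can_fetch` is a made-up example.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow:
Sitemap: https://www.auditmypc.com/mynewsitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Sitemap locations declared in the file.
sitemaps = parser.site_maps()
# An empty "Disallow:" blocks nothing, so any page may be fetched.
allowed = parser.can_fetch("*", "https://www.auditmypc.com/any-page.html")
print(sitemaps, allowed)
```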

Benefits of this Online Sitemap Generator over other Sitemap Tools

One of the major advantages of using this tool is that owners of websites find errors they never knew existed on their sites! WordPress, Joomla, Drupal, phpBB and other content management systems all have sitemap programs you can add onto the system, but these sitemap generators read from the database, NOT from the outside; although faster, they miss errors that can only be seen from an outside crawl – these errors most often prevent sites from being indexed properly by Google, Bing, Yahoo and others! Once fixed, website owners usually notice a major increase in search engine traffic!

Let me give you a real life example – It starts off with trying to make a sitemap for Google and discovering that the sitemap generator simply stops at the main page and doesn’t find other pages within the site.

It is this very problem that people often write to me about. Almost always, after reviewing their site, I discover web pages that are missing beginning or ending tags, such as html, head and body tags.
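
As a rough way to spot the kind of problem described above, the sketch below uses Python’s standard HTML parser to report which html, head and body tags never open or never close. It is a simplified check, not what the webmaster tool does internally; `StructureChecker` and `missing_tags` are names invented for this example.

```python
from html.parser import HTMLParser


class StructureChecker(HTMLParser):
    """Record which of the html, head and body tags are opened and closed."""

    REQUIRED = ("html", "head", "body")

    def __init__(self):
        super().__init__()
        self.opened = set()
        self.closed = set()

    def handle_starttag(self, tag, attrs):
        if tag in self.REQUIRED:
            self.opened.add(tag)

    def handle_endtag(self, tag):
        if tag in self.REQUIRED:
            self.closed.add(tag)


def missing_tags(html_text):
    """Return the beginning or ending tags that never appear in the page."""
    checker = StructureChecker()
    checker.feed(html_text)
    checker.close()
    problems = []
    for tag in StructureChecker.REQUIRED:
        if tag not in checker.opened:
            problems.append(f"<{tag}>")
        if tag not in checker.closed:
            problems.append(f"</{tag}>")
    return problems


# A page whose body tag is never closed.
page = "<html><head><title>t</title></head><body><p>hi</p></html>"
print(missing_tags(page))
```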

I also discover that a large number of site owners are accidentally blocking robots from visiting their page. In the sitemap builder, you’ll notice that there is an option to honor robots.txt files and no follow tags.

If you’re having a problem with the sitemap builder and your page is formatted correctly, try deselecting the robots and no follow options. If this works, then the problem is with one of these items.

Every attempt has been made to make our webmaster tool behave as the search engine robots do when spidering a site. There are standards that each search engine subscribes to when reading websites and we subscribe to that same data. The point is, if we catch the errors, it’s a very real possibility that the search engines will also.

A perfect example would be a hosting company blocking our spider because it’s going too fast – chances are, the hosting provider is also doing this to the search engine robots and could be preventing them from seeing your entire site (which could lead to poor ratings). See changing the user agent above for solutions to this problem.

Common Sitemap Builder Problems

Problem: I click the image / link to run the sitemap generator but nothing happens. All I see is a page with a few links including a link to donate a cup of chai for offering such a cool tool, which I’d be happy to do if it worked :)

Chances are you’re not running Java or have an old version. Java is free and you can check your version at java.com/en/download/installed.jsp – once java is running, you’ll see the program and fall in love :)

Problem: You have created a sitemap but it picked up hidden files which you don’t want the search engines to see, so you deleted them from the sitemap.xml file, but the search engines still see the files.

Solution: If the sitemap builder can find these pages that you think are hidden, then so can the search engines. Sure, you can exclude them from the sitemap.xml, but the problem is that you are linking to these hidden files from one of your webpages. Click on the plus next to that URL under the Sitemap Tab of our generator and you’ll find all the urls linking to that hidden file.

If you want to exclude the files rather than hide them, you can exclude them in your robots.txt file. My sitemap builder will respect the robots.txt file (obey it), just like the search engines, and prevent those files from being included in the site map. Note: Not all search engines respect your robots.txt file and may look at the url regardless.

Problem: You enter your website address and nothing happens.

Solution: Is that really your website’s main page? For example, you might have entered http://[yoursite.com] as the address, but if you type this in the browser and you end up at http://[yoursite.com]/index.shtm, then you have a landing page that is different from your website address.

In this case, you would enter http://[yoursite.com]/index.shtm as the site address.

Problem: I want to exclude images and css files.

Solution: Check the ‘Exclude Images’ and enter *.css as an exclude filter or enter the following in the exclude area:
*.jpg
*.bmp
*.gif
*.tiff
*.css
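
The include / exclude behaviour can be approximated with Python’s `fnmatch` module, as sketched below. This is an illustration under assumptions: the tool’s real matcher may treat special characters differently, and `should_process` and the example.com urls are hypothetical.

```python
from fnmatch import fnmatchcase

EXCLUDE = ["*.jpg", "*.bmp", "*.gif", "*.tiff", "*.css"]


def should_process(url, include=(), exclude=()):
    """Apply case-sensitive wildcard filters to a url."""
    # An empty include list means everything is included.
    if include and not any(fnmatchcase(url, p) for p in include):
        return False
    return not any(fnmatchcase(url, p) for p in exclude)


print(should_process("http://example.com/index.html", exclude=EXCLUDE))
print(should_process("http://example.com/img/logo.jpg", exclude=EXCLUDE))
# Filters are case sensitive, so an upper-case extension is not caught.
print(should_process("http://example.com/img/LOGO.JPG", exclude=EXCLUDE))
```

The last call shows why a separate *.JPG filter is needed when a site uses upper-case file extensions.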

Problem: You want to capture only urls that are in the men sub-directory and contain only numbers, letters and forward slashes:
http://[yoursite.com]/shopping/men/casual/21/2
http://[yoursite.com]/shopping/men/sports/soccer
but not:
http://[yoursite.com]/shopping/men/sports-2 (has a dash)

Solution: Use a regular expression by prefixing it with ‘complex:’, for example:
complex:http://[yoursite.com]/shopping/men/[A-Za-z0-9/]*$

About regex:

  • Entire url (including protocol and host) matched against pattern, for example:
    complex:http://[yoursite.com]/shopping/men/[A-Za-z0-9/]*$
  • Pages filtered out are not processed, i.e. let’s say we have a root page that references page A, which, in turn, references page B, and page A doesn’t match the filter rules; then you will never reach page B.
  • ** Quick Reference **
    [A-Za-z0-9] = Alphanumeric characters
    [A-Za-z0-9_] = Alphanumeric characters plus “_”
    [^A-Za-z0-9_] = Non-word characters
    [A-Za-z] = Alphabetic characters
    [ \t] = Space and tab
    [\x00-\x1F\x7F] = Control characters
    [0-9] = Digits
    [^0-9] = Non-digits
    [\x21-\x7E] = Visible characters
    [a-z] = Lowercase letters
    [\x20-\x7E] = Visible characters and spaces
    [-!"#$%&'()*+,./:;<=>?@[\\\]^_`{|}~] = Punctuation characters
    [ \t\r\n\v\f] = Whitespace characters
    [^ \t\r\n\v\f] = Non-whitespace characters
    [A-Z] = Uppercase letters
    [A-Fa-f0-9] = Hexadecimal digits
  • + Match one or more of the previous items (previous character) so, the expression Rob+in would return Robin, Robbin, and Robbbbin. Alternatively, you can build a list of Previous Items by using square brackets. Like this: [abc]+ This will return a, ab, cab, c, b, bbbb, etc.
  • The caret (^) matches the beginning of the document. Applying ^a to abc matches a, but ^b would not match because abc doesn’t start with b
  • The dollar sign ($) matches the end of the document.
  • A backslash (\) followed by any special character matches the literal character itself, that is, the backslash escapes the special character.
  • The # and – characters must be escaped in expressions (## –) just as though they were special characters.
  • A period (.) matches any character, including a new line.
  • An asterisk (*) matches 0 or more of the preceding character (note that it will not be able to match an ending forward slash but period will).
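
The men sub-directory pattern from the problem above can be tried out in Python before pasting it into the exclude or include area. This is a sketch only: the tool runs on Java, whose regex engine is close to but not identical with Python’s, and example.com stands in for the [yoursite.com] placeholder.

```python
import re

# The same expression as above, without the 'complex:' prefix.
PATTERN = re.compile(r"http://example\.com/shopping/men/[A-Za-z0-9/]*$")


def matches(url):
    """Match the entire url (protocol and host included) against the pattern."""
    return PATTERN.fullmatch(url) is not None


print(matches("http://example.com/shopping/men/casual/21/2"))
print(matches("http://example.com/shopping/men/sports/soccer"))
print(matches("http://example.com/shopping/men/sports-2"))  # has a dash
```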

Problem: You have a WordPress site and want to exclude shortlinks (like testingiam.com/?p=31) from your xml sitemap.

Solution: Use a regex expression on one of the lines in the exclude url section.
complex:.\?.
The command above will exclude any url with a ? in it.
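
Here is a quick way to see what that expression catches, sketched in Python. It assumes the expression is searched for anywhere inside the url, which is what the description above implies; `has_query_marker` is a made-up helper name.

```python
import re

SHORTLINK = re.compile(r".\?.")  # any character, a literal ?, any character


def has_query_marker(url):
    """Return True when the url has a ? with a character on each side."""
    return SHORTLINK.search(url) is not None


print(has_query_marker("http://testingiam.com/?p=31"))
print(has_query_marker("http://testingiam.com/about/"))
```

Because the expression needs a character on both sides of the question mark, a url that merely ends in ? would not be caught under this assumption.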

Problem: Site Map Generation slows down after 20,000 pages

Solution: Some webmasters have noticed that during a crawl of a very large site, the sitemap generator may slow down after spidering about 14,000 urls. This can happen if the site is heavily nested or has a complex linking structure.

People who experience this lag usually have rapid applet memory consumption and need to increase the amount. The site map builder by default is limited to 50-100m which can quickly be consumed on a complex web site.

To solve this problem, you can increase the amount of memory used by the site map builder. Simply navigate to the control panel and click on the Java Icon. Then, inside the Java Control Panel, click on the Java Tab, Java Applet Runtime Settings, View and then in the Java Runtime Parameters cell, enter ‘-Xmx256m’.

You can take it a step further when building a sitemap (if you’re still having problems) and enter ‘-Xmx512m’.

Problem: You enter your website address and the site map builder stops immediately.

Solution: This is caused because your main page is redirected to another page (landing page). For example, you may have yoursite.com being redirected to yoursite.com/sales/products/sindex.htm

If this happens to you, simply enter your website address into your browser and notice where you are redirected to; take that redirected website address and enter it into the sitemap generator.

In the example I used above, you would enter:

yoursite.com/sales/products/sindex.htm

into the sitemap generator.

Problem: The sitemap generator finds JPG files even though you’ve ticked the “exclude images” option.

Solution: Add the extension to the exclude filter as well, such as *.JPG

Problem: The sitemap generator misses a few or many files.

Solution: If you are having problems building a sitemap, it may be due to your Robots.txt file or your Metatag. Try unchecking the Follow robots.txt rules and/ or meta name robots rules.

Problem: I can’t see the webmaster tool graphic button, so I can’t start the test.

Solution: If that’s the case, then your browser settings may be preventing sitemap generation.

In IE, look under Tools, Internet Options, Security, Custom Level, Scripting of Java applets and choose prompt. Active scripting should be enabled as well.

In Firefox, look under tools, options, web features and make sure the Enable Java and JavaScript is selected.

If after trying these you still have a problem, please let me know and I will do my best to get you up and running.

Problem: Google Sitemap Invalid Date Error Message

Solution: If you get an ‘invalid date’ when you submit your sitemap, check to make sure that the time is not in the future. A common mistake is not accounting for daylight saving time when creating the sitemap, so make sure you use the time zone for your server and not the local time zone.

Note: This sitemap generator runs on your PC and not the server.
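
One way to avoid the time zone trap described above is to write the modified date in UTC with an explicit offset, which the sitemap format accepts. The sketch below shows the idea; `lastmod_utc` and `is_in_future` are hypothetical helpers, not features of the tool.

```python
from datetime import datetime, timedelta, timezone


def lastmod_utc(moment):
    """Format a timezone-aware datetime as a UTC timestamp with offset."""
    if moment.tzinfo is None:
        raise ValueError("the datetime must carry a time zone")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S+00:00")


def is_in_future(lastmod, now=None):
    """Return True when the timestamp lies after the current time."""
    now = now or datetime.now(timezone.utc)
    return datetime.fromisoformat(lastmod) > now


# Noon on a PC set to UTC-5 is 17:00 in UTC.
local_noon = datetime(2020, 1, 1, 12, 0, tzinfo=timezone(timedelta(hours=-5)))
stamp = lastmod_utc(local_noon)
print(stamp, is_in_future(stamp))
```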

Problem: Only one url (page / website address) shows in the sitemap generator and I know I have hundreds of pages?

Solution: Open your browser and visit your website’s main page. When you see the main page in the browser, copy the website address and paste that address into the sitemap generator URL field under the settings tab. Don’t type it in; copy and paste the entire address just as it appears and you’ll be all set.

When my tool builds a sitemap, it needs a valid starting url. Chances are, you have given it a url that is a redirect. For example, if I gave it http://AuditMyPC.com, it would stop, it needs https://www.auditmypc.com (auditmypc.com redirects to www.auditmypc.com).

 

How to Run my Sitemap Generator/Webmaster Tool

Since version 52, Firefox has disabled support for Java applets in the standard version of the browser. However, the version most government agencies, universities and other large organizations use is Firefox ESR (Extended Support Release). There is a 32 bit and a 64 bit version; you’ll want to use the 32 bit version. It works great and will allow you to run my sitemap generator – trust me, it’s a small price to pay to discover the problems my sitemap generator finds with your website.

Here are the steps…

1) Search Google for Firefox ESR

2) Click the download button

3) Select the version for your language, but do not choose the 64 bit version.

4) Revisit my website’s sitemap generator page and click the accept button when you get a security warning.

I could have paid for a certificate so this message would not appear, as I have in the past, but I do not charge visitors for my sitemap generator/webmaster tool, so I’m no longer paying the fee. You either trust me or you don’t; it is entirely up to you. I have had this website for almost two decades now and you can read more on my about page.

Like my Sitemap Generator?

Like our sitemap builder? Please let others know by displaying this icon. Simply copy and paste the code snippet below onto your webpage:

<a href="https://www.auditmypc.com/free-sitemap-generator.asp" target="_blank"><img alt="Sitemap Generator" src="https://www.auditmypc.com/images/sitemap-generator-80x15.gif" width="80" height="15" border="0" /></a>

Or, perhaps this XML Generator icon

XML Sitemaps – 1kb at 80 x 15 in .gif format.

<a href="https://www.auditmypc.com/free-sitemap-generator.asp" target="_blank"><img alt="XML Sitemap Generator" src="https://www.auditmypc.com/myicons/xml-sitemap-generator.gif" width="80" height="15" border="0" /></a>

Comments

  1. Jim says:

    Thank you for your help and offer Standa! Your kudos is gift enough, thank you!

  2. Jim says:

    Hi Mark,

    The latest version of IE does not run Java. You will need to run Firefox ESR as mentioned at the top of this page – see my section on how to run my website tool.

    Enjoy!

  3. Mark says:

    Sitemap generator won’t work on IE with the latest Java installed. It says that the code is too old, or something like that.

  4. Standa says:

    Hello! Problem solved! Java really changed the security settings in the latest update and your sitemap application could not be run (it was blocked by Java or browser settings etc.). I read a few things and then I realized it is easy: Java also has its own settings, and there is an exception list of sites which should not be blocked by the new high security settings – I placed your web into the exception list and it works (of course Java was warning me a few times :-)). Great – I found that only your application also works with pictures and many, many links. Unbelievable. You are the best, if you want me to send 50 USD to your bank account as a fee for using it – I am ready to send it :-). Best regards, Stanislav Smiga

  5. Standa says:

    Hello, your website is known world wide and your sitemap tool is very good, but the latest JAVA started some kind of security check and I cannot start it on any machine, in any browser. Do you know more about what is going on? What do they expect, and are you able to meet those requirements? You do a very good job, your tool is the best I know – and I am not saying that only because it is free to use. Best regards, Stanislav

  6. Mario says:

    Further to my previous message… I exported it to CSV and noticed that these files were excluded because of “Load from server rule”

  7. Mario says:

    Hi. First of all thanks for this sitemap software. It looks awesome.

    I have a problem…. I have several files which somehow were not included in the sitemap – they are not being excluded by robots.txt. In fact they are already indexed by Google. So I ticked the ‘Include skipped files’ option and they can be seen in the Sitemap tab.
    However when I export the sitemap to xml the skipped files are not included. How can these be included?

    Thanks

  8. Jim says:

    Hi Matthias,

    Can you tell me what system and setup you have?

    Thanks

  9. Jim says:

    Hi Kevin,

    I do this for free, have a full time job and family so I try to respond as soon as possible. It works great for me, and others, and I’ve tested on multiple machines as well.

    What version of Java are you using?

  10. Jim says:

    Hi Ted,

    Simply start a new project and it will delete the old session.

  11. Jim says:

    Hi Chris,

    Is the website the last part of your email? If not, let me know what it is (I won’t publish it) and I’ll take a look. If I can do it for free, I will.

    Best.

  12. Chris says:

    Can I pay you a fee to create my sitemap? I have approx 240,000 product pages and want it done right so that all my pages get crawled and possibly indexed. Thanks!

  13. Lhr says:

    I just wanted to say this tool is awesome.

    Thanks for providing such a great tool for free.

  14. ted says:

    Hi,
    It appears that after crawling the home page it crawls the urls in order in which they appear on the home page. I see that it retains the session between crawls. Why does it do that? My understanding is that google approaches each page fresh.

    I assume the idea is to mimic user behavior for how they might peruse a site, but since you go in order of appearance of links, in my case it results in something that can never happen: a user can’t go from an Austin/TX url to a Boston/MA url. The sitemap generator ends up being redirected to my home page when it tries to do that, and the end result is that the links for half of my 50 cities are never discovered.

    A solution for ensuring that it picks up all of the links is to destroy the session at the end of each page. Is that ok to do or does it in some way mess up something else the generator is trying to do?

    thanks,
    Ted

  15. Kevin Boff says:

    Yes, I have used the sitemap generator for years, but since last week it no longer works. I have done the Java check, which confirms we have the latest installed (Java 8 update 20), so all you get is the message “waiting for www.auditmypc.com” and nothing happens; the symbols just continue to go round and never connect, so very frustrating!!

    Any ideas what’s wrong??

  16. Matthias says:

    Hi,

    thanks for this great tool !
    When opening a saved project it says “Invalid reference count: 0”. After pressing OK, all settings are gone. After storing it again, the saved project xml is only half the size.

    Two other questions:
    When I then start crawling, I need to check “Download only new and modified files”, right?
    Are changes in Settings taken into account after saving the project, or already after stop + start crawling?

    Thanks in advance
    Matthias

  17. Jim says:

    Hi Pete,

    Thank you for making my day :)

    I did a quick look at the site and ran the xml generator with a useragent of Googlebot/2.1 (+http://www.google.com/bot.html). What I found right off was an error message telling me the page was forbidden because of flooding.

    The only time this ever happens is when you are hosting with a company that controls the rate at which you are being spidered. This is NOT something you want, as Googlebot may not adjust its rate to fit your host’s limits, which will result in your pages not being indexed!

    If I slow the crawl way down, the forbidden errors go away.

    Solution: Tell your web hosting company to stop controlling the rate at which your site is spidered.

    Try this: find your competition’s website on the web, then spider their site for about 60 seconds and stop. I’ll bet you’ll see there are no such errors and they are not rate controlled.

    Good luck!

    Jim

    Some Google Bot user agent strings…
    Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
    Googlebot/2.1 (+http://www.googlebot.com/bot.html)
    Googlebot/2.1 (+http://www.google.com/bot.html)
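The "slow the crawl way down" advice in the comment above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how the sitemap generator itself works: `crawl_politely` is a hypothetical helper, the `fetch` callable (URL in, HTTP status code out) is supplied by the caller, and doubling the wait after each 403 is an arbitrary choice.

```python
import time
from typing import Callable, Dict, Iterable


def crawl_politely(
    urls: Iterable[str],
    fetch: Callable[[str], int],
    delay: float = 1.0,
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, int]:
    """Fetch each URL, pausing between requests and backing off on 403.

    `fetch` is a caller-supplied function (an assumption of this sketch)
    that takes a URL and returns the HTTP status code.
    """
    results: Dict[str, int] = {}
    for url in urls:
        wait = delay
        retries = 0
        status = fetch(url)
        while status == 403 and retries < max_retries:
            # Treat 403 as possible flood protection: wait longer, then retry.
            wait *= 2
            sleep(wait)
            status = fetch(url)
            retries += 1
        results[url] = status
        # Always pause before moving on to the next URL.
        sleep(delay)
    return results
```

If pages that failed with 403 succeed once the delay is raised, that points at rate limiting on the host rather than a problem with the pages themselves.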

  18. Pete says:

    Hi Jim, first let me thank you for providing an excellent resource. I have a question: when I crawl my site it finds 33 failed (403) out of a total of 43 pages.

    If I then keep clicking on “retry failed”, it eventually succeeds on those pages. Is this normal?
    ps: I hope you enjoy the chai

  19. Jim says:

    Dave, you are the man! Thank you so much for dinner, I greatly appreciate it!

  20. Dave N says:

    Besides finally getting my sitemap done, your export to xyz_sitemap.html (showing web site structure via indents) is the answer to my prayers. Just bought Jim a Red Lobster gift card!

  21. Jim says:

    Hi Eva,

    Look at the jpg files, if they are .JPG or .Jpg, then add those exclusions as well. They are case sensitive.

    *.JPG
    *.jpg
    *.Jpg

    Sorry, I need to change this on the next update.

    Jim
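The case sensitivity described above is easy to reproduce outside the tool. A minimal sketch using Python's standard `fnmatch` module, assuming the filters behave like ordinary shell-style wildcards (the generator's real matching rules are not documented here, and both helper names are hypothetical):

```python
from fnmatch import fnmatchcase
from typing import Sequence


def is_excluded(url: str, patterns: Sequence[str]) -> bool:
    """Return True if the URL matches any pattern, comparing case-sensitively."""
    return any(fnmatchcase(url, pattern) for pattern in patterns)


def is_excluded_any_case(url: str, patterns: Sequence[str]) -> bool:
    """Same check, but lower-case both sides so one pattern covers every spelling."""
    return any(fnmatchcase(url.lower(), pattern.lower()) for pattern in patterns)


if __name__ == "__main__":
    url = "http://example.com/photos/holiday.JPG"
    print(is_excluded(url, ["*.jpg"]))                    # case-sensitive check
    print(is_excluded(url, ["*.jpg", "*.JPG", "*.Jpg"]))  # all three spellings listed
    print(is_excluded_any_case(url, ["*.jpg"]))           # case-insensitive check
```

With a case-sensitive matcher, `*.jpg` alone misses `holiday.JPG`, which is why all three spellings are listed in the comment above.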

  22. Eva says:

    Hi Jim,
    *.jpg in the ‘exclude url filters’ area doesn’t work.
    Did I do something wrong?

    Thank you!
    Eva

  23. Jim says:

    Hey Chris, when someone reports this, it is usually the project file they are submitting, not the sitemap. You need to export an XML sitemap from my program – don’t select save project as…

    What is the address of your sitemap and I’ll take a look to confirm this.
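One quick way to tell an exported sitemap from some other XML file is to look at the root element. The sketch below assumes the sitemaps.org 0.9 format, whose root is `<urlset>` in the namespace `http://www.sitemaps.org/schemas/sitemap/0.9`. The layout of the generator's project file is not described on this page, so the `<project>` element in the usage note is purely a stand-in, and `count_sitemap_urls` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def count_sitemap_urls(xml_text: str) -> int:
    """Return the number of <url><loc> entries in a sitemaps.org <urlset>.

    Raises ValueError when the root element is something else, which is
    what you would expect from a file that is not an exported sitemap.
    """
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{SITEMAP_NS}}}urlset":
        raise ValueError(f"root element is {root.tag!r}, not a sitemap <urlset>")
    return len(root.findall(f"{{{SITEMAP_NS}}}url/{{{SITEMAP_NS}}}loc"))
```

Calling it on a valid export returns the number of listed pages, which is also a handy figure to compare with the "submitted" count the search engine reports. Calling it on a made-up document such as `<project><settings/></project>` raises `ValueError`.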

  24. Chris says:

    I have tried several times to submit the generated xml sitemap as well as an html sitemap to google and all I get from google is “Your Sitemap does not appear to be in a supported format. Please ensure it meets our Sitemap guidelines and resubmit.” Google gives this same error whether I submit as xml or html. I have no errors on the build and have checked the guidelines which are not very helpful as it would be assumed that the builder itself would make compliant files or give you help on what would be wrong and to make compliant. Any help is appreciated. Thank you.

  25. Jim says:

    You made my day Eva, thank you!

  26. Eva says:

    Jim, thank you very much!
    ( Donated a Chai tea :-) )
    Eva

  27. Jim says:

    Hi Eva,

    Just enter the name of your website and also under useragent, enter what I posted above then run the xml sitemap generator. If it gives you an error, click the green start button again and you’ll be all set.

    Best,

    Jim

  28. Eva says:

    Hello Jim, you wrote:
    ”
    Hi Eva,

    There is something not right here – If I play with the useragent string, I get different results!

    For example, run the sitemap generator, but in the useragent field, put this:

    Mozilla/5.0 (Windows NT 6.1; WOW64; rv:15.0) Gecko/20120427 Firefox/15.0a1

    Then run the generator. If your server / host is restricting traffic on useragent string, it could be hurting your site. Google uses different useragent strings to review your site, as do mobile devices. This was a cursory review but in that short time, that popped out.

    And yes, looking at the sitemap, times, links and more can tell you soooo much :) It can give you that advantage you are looking for :)

    Jim
    ”

    I don’t understand anything of such things, so I asked our hostingprovider. He told me that the server configuration is the same as for our other site. For these sites your tool works perfect!
    Jim, my English is bad, my computer-English even more bad and i have no idea how to solve this problem. Is it something easy? Do I need professional help?
    Thanks!
    Eva

  29. Jim says:

    Hi Eva,

    There is something not right here – If I play with the useragent string, I get different results!

    For example, run the sitemap generator, but in the useragent field, put this:

    Mozilla/5.0 (Windows NT 6.1; WOW64; rv:15.0) Gecko/20120427 Firefox/15.0a1

    Then run the generator. If your server / host is restricting traffic on useragent string, it could be hurting your site. Google uses different useragent strings to review your site, as do mobile devices. This was a cursory review but in that short time, that popped out.

    And yes, looking at the sitemap, times, links and more can tell you soooo much :) It can give you that advantage you are looking for :)

    Jim
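The user-agent experiment described above can also be run from a short script. This is a sketch under assumptions, not part of the sitemap generator: the helper names are hypothetical, and `send_over_network` makes real HTTP requests, so point it only at a site you are allowed to test.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Create a request that identifies itself with the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def send_over_network(request: urllib.request.Request) -> int:
    """Send the request for real and return the HTTP status code."""
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses arrive as exceptions; the code is what we want.
        return err.code


def status_by_user_agent(
    url: str,
    user_agents: Iterable[str],
    send: Callable[[urllib.request.Request], int] = send_over_network,
) -> Dict[str, int]:
    """Request the same URL once per user agent and collect the status codes."""
    return {ua: send(build_request(url, ua)) for ua in user_agents}
```

If the same URL returns 200 for the Firefox string quoted above and 403 for another, the server or host is filtering on the user-agent string.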

  30. Eva says:

    Hi Jim, love your reports, especially reporting problems.
    Used it with pleasure, but it doesn’t crawl our website
    vakantie-oostenrijk .nl

    Problem: 403 at the very first file.
    I don’t know how to solve this problem.
    Help please!
    Thank you very much,
    Eva

  31. Jim says:

    Hi Tom,

    Thank you for the donation!

    I’ve looked at your site and generated a sitemap for you :) I also noticed a number of issues that may be preventing you from ranking higher:) For example, you’ll notice that there are a large number of pages without titles or descriptions! There are also a large number of pages that have duplicate titles!

    I see that most of the duplicate titles are sub pages of the main page which does have a title, so I’ve created a sitemap that includes that main page. Google will discover the sub-pages with duplicate titles, but should guess that what you have in the sitemap is the important page. It works for me :)

    I’m guessing that you can’t really get into the code and modify it for unique titles, etc, so this is your best option and will work for you.

    I also noticed 5 errors, one was your 20% off page. Not a big deal unless that 20% off page is being used!

    I’ve emailed the sitemap to you along with the xsl stylesheet. Simply place these in the root of your directory (you’ll be replacing the sitemap.xml file already there)

    Let me know when that’s complete and I’ll take another look.

    BTW: Here are the “Exclude url filters”:
    *.jpg
    *.Jpg
    *.JPG
    complex:.?.

    Regards,

    Jim
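The missing-title and duplicate-title checks mentioned in this reply boil down to grouping pages by title. A minimal sketch, assuming you already have a mapping of URL to title text (for example from an exported report); `audit_titles` is a hypothetical helper, and comparing titles case-insensitively is a choice made for this example:

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def audit_titles(pages: Dict[str, str]) -> Tuple[List[str], Dict[str, List[str]]]:
    """Split pages into those with no title and groups sharing one title.

    `pages` maps URL -> title text (an empty string when the page has none).
    Titles are compared case-insensitively after trimming whitespace.
    """
    missing: List[str] = []
    by_title: Dict[str, List[str]] = defaultdict(list)
    for url, title in pages.items():
        cleaned = title.strip()
        if not cleaned:
            missing.append(url)
        else:
            by_title[cleaned.lower()].append(url)
    duplicates = {t: urls for t, urls in by_title.items() if len(urls) > 1}
    return missing, duplicates
```

The `missing` list and each group in `duplicates` are the pages to fix first, or to leave out of the sitemap in favour of the main page, as described above.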

  32. Tom says:

    Jim,
    In addition to the other questions already asked, I tried to do a Bing account and submit the sitemap.xml to it but it said ‘not found 404’. So I read further about entering a code that ALL the robots will follow, but I am not sure where or how to put that info in my site.

    and yes, it is your sitemap that is referred to in the robots.txt.

    Tom

  33. Jim says:

    Tom, I see a sitemap in your robots.txt, is this the sitemap you created with my sitemap generator and also the sitemap you submitted to Google?
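To see which sitemap a robots.txt file advertises, Python's standard `urllib.robotparser` can read the `Sitemap:` lines (its `site_maps()` method needs Python 3.8 or later). A minimal sketch with a hypothetical helper name; the robots.txt text is passed in as a string, so nothing is fetched:

```python
from typing import List
from urllib.robotparser import RobotFileParser


def sitemaps_in_robots(robots_txt: str) -> List[str]:
    """Return every URL declared on a 'Sitemap:' line of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []
```

If the URL it returns gives a 404 when opened in a browser, a search engine will report the same "not found" error when the sitemap is submitted.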

  34. Tom says:

    Jim, I pasted what I think is the address that you were asking for into the comment box. I see that it doesn’t show in the comments. Is that by design? Did you get the info? Tom

  35. Jim says:

    Hi Tom,

    Great to hear it is working for you! When asking for the location of the sitemap, I was referring to the name of the sitemap file itself and the location that it is stored on your server. When you told Google the address of the sitemap, it had to include the full website address and that was what I was looking for.

    There is a TON of information to be had from reviewing the competition’s site using my tool :)

    There is no limit on the number of pages when using my xml sitemap tool. The other companies are in it for the profit and limiting it to 500 is simply a way for them to suck you in.

    The sitemap helps in a number of ways… By making search engines aware of your pages, especially deep linked ones, by discovering errors on your site, by finding flaws in your hosting company, by reviewing keywords and descriptions, by understanding the link structure and much more. To answer your question, yes, it will help make sure Google is aware of your pages. More pages can mean more traffic (depends on content).

    Best,

    Jim

  36. Tom says:

    Dear Jim,
    Good news! This AM I looked at google webmaster sitemap etc., in our account and the Bar graph shows 500 submitted and 481 indexed. :-) Interestingly, when I did the sitemap ‘test’, it still indicated that 395 pages were submitted. ???
    However, it looks as though we are off to a good start.
    I still have a question. With 2079 lines on the sitemap generator spreadsheet, how many of those items are ‘pages’ that CAN/SHOULD be indexed? Prior to finding your generator, I had seen several generators that limit the # of pages to 300 or 500. You refer, in your copy, to large websites and how they are to use your generator. Are the 500 pages submitted from our site the result of a limit your generator imposes (500?)? Are there more pages that did not get submitted, or is it a huge coincidence that our site has exactly 500 pages out of those 2079 lines on the spreadsheet? Is there a way to look at the spreadsheet data to determine how many of those lines are ‘submittable pages’?
    In a practical vein, can we expect an improvement in traffic, searches and other related things?

    We are really excited about the progress so far and thank you very much.
    Tom :-)

  37. Tom says:

    I just went to the generator and put the url of one of my competitors in a new project and it ran just like mine did. Of course I did not ‘submit’ the spreadsheet to the sitemap creator, but I suppose that I could. I just couldn’t load it to their site since that does require password etc.
    Let me know what I need to do. Thanks, Tom

  38. Tom says:

    I am not sure that I understand what you mean by the ‘address’ of the site map. I uploaded it to our site and then with filezilla, to google webmaster tools. Since your sitemap tool didn’t request any password, would you not be able to create a sitemap just by entering our site address in the generator and then looking at the sitemap that it creates (in the spreadsheet)?
    Is there a way that I can recreate it and send it to you?
    Tom

  39. Jim says:

    Thanks for the address tom, but I’ll need the sitemap that goes with it. So, simply let me know the address of the sitemap you submitted to google.

  40. Jim says:

    Dear Tom,

    I can’t help without the location of your sitemap, once I see that, then I can tell you what is going on.

  41. Tom says:

    Dear Jim,
    It appears that your sitemap generator is just what I am looking for. I ran it a few times and after I deleted the errors found, I created it and uploaded it to our site. Then through google webmaster tools to optimization/sitemaps. It showed 500 pages submitted. The stats of our sitemap.xml showed 2093 lines in the spreadsheet, 14 failed which I deleted and 2079 processed.
    Here are my questions.
    When I tested the submission, it said that 395 pages were submitted, not 500. What’s with that?
    I do not know how many pages we should have submitted, but out of 2093 lines, I find it hard to imagine that there were exactly 500 and then reduced to 395.
    I submitted it to google on June 12 and as yet there are no indexed pages. Why is that?
    Immediately after I submitted it to google, I requested that google crawl the site right away.
    Can you help me with any of these issues?
    Thanks, Tom

  42. John Ashcroft says:

    Awesome, what a great video. This is just a great tool and perfectly explained.

    Many Thanks!

  43. Jim says:

    Hi Danny, I’ve updated the instructions with a graphic here:
    https://www.auditmypc.com/free-sitemap-generator.asp#sitemapxsl

    If you select the .xsl option, then you need to download the xsl zip file and place it in the same location as the sitemap file.

    I looked at the sitemap and it seems to be fine. I also checked the site and didn’t find any 301’s, 302’s, etc. What are some of the examples Google is showing errors on? What are the urls / addresses? With a few examples, I can check your server headers and compare to the sitemap file to find out what’s going on…

  44. Danny says:

    Hi Jim,

    Thanks for responding to my query.
    Oddly enough, the sitemap was accepted by Google, after resubmitting it and in spite of the fact that I was getting the same error, over and over again. The only difference was, that before my last submission, I cleared my browser cookies and cache, uploaded my sitemap again to web master and waited. It still showed an error but it cleared away…very odd. Only that now, there is an increase in 302 errors (92 links not followed) and a low number of indexed links (96 out of 904 submitted), which most likely has nothing to do with it.

    Thanks again,

    Danny

  45. Jim says:

    Hi Danny,

    Where is your sitemap located? What is the exact address that you gave Google and I’ll take a look. I won’t post the location from your comment, so no worries.

  46. Danny says:

    Great tool – however, I am getting the following error “Unsupported file format Your Sitemap does not appear to be in a supported format. Please ensure it meets our Sitemap guidelines and resubmit.” , upon uploading to my server and submitting to Google webmaster tools.

    Perhaps this will help: the following message shows when viewing site map in browser – “This XML file does not appear to have any style information associated with it. The document tree is shown below.”

    What am I doing wrong?

    Thanks,

    Danny

  47. Jim says:

    Hi Jim,

    I looked at your text file and it’s just that, a basic text file that has multiple entries like:
    https://www.auditmypc.com/free-sitemap-generator.asp

    The sitemap generator will not open an unlinked, non html file (.txt file), read it and follow unformatted urls this way.

    Link to it from a webpage, configure .htaccess so that .txt files are parsed as html / php, etc., then format the links as a href anchors and it will see them.

    Hope that helps.
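An alternative to reconfiguring the server is to generate a small HTML page from the text file and link to that page instead. A minimal sketch, assuming the file holds one URL per line; `urls_to_html` is a hypothetical helper, not something the sitemap generator provides:

```python
from html import escape
from typing import Iterable


def urls_to_html(lines: Iterable[str], title: str = "URL list") -> str:
    """Turn plain-text URLs (one per line) into an HTML page of <a href> links."""
    items = []
    for line in lines:
        url = line.strip()
        if not url:
            continue  # skip blank lines
        safe = escape(url, quote=True)
        items.append(f'  <li><a href="{safe}">{safe}</a></li>')
    body = "\n".join(items)
    return (
        "<!DOCTYPE html>\n<html>\n"
        f"<head><title>{escape(title)}</title></head>\n"
        f"<body>\n<ul>\n{body}\n</ul>\n</body>\n</html>\n"
    )
```

Saving the result as an .html file and linking to it from any crawled page gives a spider ordinary anchors to follow.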

  48. Jim says:

    Hi Gord,

    Try a different computer or narrow down the results if working on the same computer. The app does see sub domains, you simply need to select that option.

  49. Simon says:

    best tool i have found for doing site maps so far. thank you.

  50. Izaim says:

    U saved my life! This tool is awesome. No Comments!

  51. Gordon MacDonald says:

    a) The results do not save for me after about 70,000 pages are processed – is there anything I can do to get the entire website processed (approx 120,000 pages)
    b) the app does not seem to identify sub domains – does the app support the identification of sub domains?

    Thanks
    Gord

  52. Jim Mueller says:

    Awesome tool! Thank you.
    Question:

    Our CMS creates a raw url text file of each of our articles; it is on our server and *is* crawled by the SE’s.
    For some reason, your tool will *not* crawl this file.

    Above is a *.txt file — that should not be an issue, correct?
    I even created an “include url filter” for *.txt files . . .

    Thoughts? Thanks.
    Jim

  53. Tom Dale says:

    Think I found it. I’m using/hosted by Fortune3.com – not sure if it makes any difference… Found that by selecting the ‘Load from Directory’ instead of the ‘Load from Server’ suddenly it started working (and finding lots of things for me to ‘fix’) – Coffee on the way….

  54. Tom Dale says:

    Looks like a great tool, and the video was excellent. Trouble is, when I enter my site it only finds the main page. Saw comments about ‘copy & paste’ the url, but I can’t seem to get your url box to accept any ‘edit’ commands so I typed in exact url. It does find the main page, but nothing else…

  55. Jim says:

    Hi Debbie,

    You are using a screen resolution of 800×600 and the software needs a larger display area, so you’re only seeing a portion of the app. Increase your screen size and you’ll see it all :)

  56. Jim says:

    Thank you Susan! :)

  57. Susan says:

    Thanks for making the life of a marketing person who also has to administer some web sites A LOT EASIER! Enjoy your hot dog and Chai. Food is very important, although I’d probably do something chocolate with my chai.

  58. Debbie says:

    Hi, it seems that the video tutorial displays something that doesn’t exist in the tool. I need to exclude certain urls containing a specific string, and I don’t seem to have the textbox that appears in the video, but rather a list of the different types of filters. How do I get to this box?

  59. Jim says:

    Thanks Ray, you made my day! :)

  60. Ray Cassidy says:

    Jim, you have gifted a really valuable tool to the community of webmasters. I’ve used it on and off for about 6 or 7 months and I really haven’t found much else to beat it. When things pick up I owe you several beers – never mind a chai. It appears to have run flawlessly on all the occasions that I have used it and the webmaster set up seems to accept the resulting sitemaps without question. Many many thanks.
    Ray

  61. Jim says:

    Hi Pete,

    Thanks for the heads up. I’m working on the site right now and I’ve changed the default resolution to 1024 – this should solve the problem. Let me know if you have further problems.

  62. Help on Starting Sitemap says:

    Jim,

    I am running IE 8 and just updated to Java 6. When I start Sitemap I can see the Sitemap console start up but then the console window immediately shrinks up to about 1/2 a character high and I can’t find a way to make it visible. Thoughts?

    Pete

  63. Jim says:

    Thanks for the Chai!

    I took a look at the site and spotted a number of 403’s – HTTP/1.1 403 Forbidden Content-Length: 312 Content-Type: text/html; charset=us-ascii Server: Microsoft-HTTPAPI/2.0 Connection: close

    I noticed that all these errors have ../ in their url, so that is where the problem is for the 403’s.

    As for the title, it’s happening because you are referring visitors to http even though you are redirecting (301) to https. You’ll definitely confuse the bots and likely mess with your rankings if these are not fixed.

    No worries, easy fix… Here is what you do…
    – Run the sitemap generator and stop it after about 2 minutes.
    – Click on the Sitemap Tab.
    – Look for the first occurrence of http (not https, http) (in my session, it was item Nr. 5)
    – Now, go over to the In-L Column, see the + in front of the 3? Click that and you’ll see all the links that make a call to that url. Those pages have a call to the http and need to be changed to https :)
    – You’ll need to repeat this (rerun the sitemap generator) until you fix them all. You could just run the generator until it’s finished with the site, but I think you’ll find that if you do it little by little, it will go faster in the end.

    Note: you need to start a new project each time to see the changed results, just clicking continue won’t allow the xml sitemap generator to see the changes you’ve made.

    Okay, so we have your ../ and your http vs https problems fixed. Here’s one other tip:

    You are referencing default.asp. Look for the url under the sitemap tab and click on the +, those are the files that are calling default.asp – make them call the main url less default.asp. You should not be seeing default.asp

    That should clean your site up nicely :) I just finished my cup of chai so I’m off – have a great day Michael!
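The three clean-ups discussed in this reply (relative `../` links, `http` links on an `https` site, and links that spell out `default.asp`) can be expressed as one small normalizing function. This is a sketch under assumptions: `canonical_link` is a hypothetical helper, it only upgrades `http` links that point at the same host as the page, and it treats `default.asp` as the default document because that is the file named above.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit


def canonical_link(page_url: str, href: str, default_doc: str = "default.asp") -> str:
    """Resolve a link found on page_url into the form it should be written in.

    - relative parts such as '../' are resolved against the page's URL
    - http links to the page's own host are rewritten to https
    - a trailing default document is dropped, leaving the directory URL
    """
    absolute = urljoin(page_url, href)
    scheme, netloc, path, query, _fragment = urlsplit(absolute)
    if scheme == "http" and netloc == urlsplit(page_url).netloc:
        scheme = "https"
    if path.lower().endswith("/" + default_doc):
        path = path[: -len(default_doc)]
    return urlunsplit((scheme, netloc, path, query, ""))
```

Comparing each link on a page with its canonical form shows which ones still need editing, which is the same information the + expansion in the In-L column gives inside the tool.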

  64. michael says:

    thanks Jim

    Link as follows:
    I also should add that the robots.txt has the following line
    Disallow: /shop/

    It’s https and stampsforsale,co,uk

    There is a slight inconsistency in the website at the moment. I have moved it to new hosting and it is running as HTTPS: but there are still links with HTTP: that need to be weeded out. Perhaps that is the problem?
    ——-
    re:
    I need the url to your site and the url to one of the failed pages. The generator weeds out a lot of problems webmasters would have never known of before, such as hosts limiting traffic (do that to a google bot and you’ll never rank high!), database problems and more. I’m not saying this is your problem, but once I see the url, we’ll know for sure :)

    enjoy your chai

  65. Jim says:

    Hi Michael,

    I need the url to your site and the url to one of the failed pages. The generator weeds out a lot of problems webmasters would have never known of before, such as hosts limiting traffic (do that to a google bot and you’ll never rank high!), database problems and more. I’m not saying this is your problem, but once I see the url, we’ll know for sure :)

    And Gabriel,
    301’s don’t happen by accident, you or someone else must have modified the headers, htaccess or another file.

  66. Gabriel says:

    Hello Jim,

    I have an issue with my site, when trying to build a sitemap with your awesome generator many of the urls get the title 301 Moved, is there a problem? How could I fix it? Thanks in advance!

    All the best!
    Gabriel

  67. michael says:

    hi
    I am getting a block of links coming up with a 500 error. There are hundreds of similar links which do not fail. If I open the failed url it displays fine. I have looked in the source of the generated page and it shows no error message. How do I find what is wrong?
    michael

  68. Jim says:

    Hi Beau,

    Just now checking emails and noticed your comment – I’m way behind and will look at this tomorrow, but in the meantime, what are the filters that you’re using?

    Thanks for the Chai by the way, I greatly appreciate that!

    Jim

    Update: I’ve looked at your site and if I were you, I would start with these filters:
    */Facebook_Gigs/*
    */News/*
    *.jpg

    Work on cleaning those up, then remove the news filter, work on that and so on. I noticed that many of the news pages (which you have a LOT of), have direct links out (and more than one). With Google, you ARE who you LINK to, so don’t hurt yourself (rel nofollow perhaps).

    Hope that helps!

  69. Beau Webber says:

    Hi, looks like a great tool (just bought you a cup of chai).
    I am trying to crawl KentFolk and am running out of memory.
    It has found 40k pages when it hangs, memory errors starting at about 20k crawled, stops at 25k crawled.
    I have 8GB ram, and have followed the instructions for increasing Java memory, have tried 256m and 2048m.
    System still shows :
    Memory usage : Free memory 4.03 M, Total memory 15.5 M, Max memory 247 M
    I am using Google Chrome, and have also tried Firefox (IE does not run, assume Java is blocked).
    I have re-started the browser, I have re-booted the PC.
    Any ideas ? cheers, Beau

  70. Jim says:

    Thank you very much Johnnie and glad you like my sitemap tool :)

  71. Jim says:

    Hey Tom, read the FAQ’s and you’ll see your answer.

  72. Johnnie says:

    This is the best sitemap generator out there! I tried so many before this one and this kicks arse! The fact that you can filter has solved a long standing problem. Thank you author!!!!

  73. Tom Williams says:

    I very much want to try your SiteMap creator however it does NOT appear as shown in your video. I first see your warning about Sessions and the Cup of Tea, I click OK and nothing appears. I’ve tried at least 3 times.

    Please advise, I would like to get started right away.

  74. Albee says:

    Hi Jim,

    You’re not going to believe this! My website did have duplication from the hosting company, Hostway.

    Since they purchased Valueweb in 2009, the traditional folder was the “web” folder. It made “siblings” to the “www” and “public_html” which resulted in files being placed in all of them upon updates! I had them turn it off.

    Then I discovered I had front page extensions turned on even though I don’t publish with it. So all those folders “_vti_Garbage…” were removed as well.

    I had them make a minor revision to my htaccess with the domainaddress to www.domainaddress “redirect.”

    I never knew that the merger of the companies would result in severe duplication/triplication of files.

    I’m now getting more reasonable results with your tool. Before I was getting 35k files! Now I’m getting something closer to my real count. Currently it’s coming in with 8000 files. There are some slight duplicates but the weird thing is that they are “queued.” Also there’s a folder that shows some repetition but when I go to it, I don’t see it. Any suggestions?

  75. Jim says:

    Hi Matt,

    All that I do on this site is 100% free and I don’t have time to create filters for visitors, sorry, but I’ve got great instructions and examples that you can learn from.

    Glad to hear the hint helped :)

    I meant search Google and see what sites rank high, then look at how they are set up and modify your site in a similar way – your xml sitemap should then require little tweaking :)

    Best,

    Jim

  76. Matt says:

    Jim,

    Thanks for an ultra prompt response. Yes indeed, we use phpBB, you are correct. As for duplicate titles, well, I’m not sure it’s just one reason, but there are many multipage threads, so each page has the same title, and I’m afraid that’s not something that can be avoided. As for meta descriptions, well, I was also disappointed when I saw them missing, but it wasn’t me who was setting up the forum. :) I’ll see what can be done.

    Could you please tell me how to filter out SID? Actually, I set up many exclusions while building the sitemap, to filter out unneeded pages, but I can’t think of an exclusion that would weed out SID, especially that it’s different every time. A regular expression? I need a little help on this please.

    I was also wondering, how deep should I allow the crawler to go? The forum’s been running for not even three months and I find it hardly believable that there could be over 10k pages, but those were the numbers popping up at level 6; is it possible that it was fetching duplicate pages?

    Thanks for the hint on the soap, that’s a good one, actually. :)

    “What I would do. Look up a term that you know would be hosted on phpBB, then look at their setup, titles, description tags and see how they are configured. Perhaps they’ll tell you what tweaks they made to get it that way?”

    I’m apparently missing something here, look up on Google, or what? What will tell me about the tweaks? Sorry, completely not following you on this one. :)

    regards,
    Matt

  77. Jim says:

    Hi Matt,

    Looks like you’re using phpBB, correct? If you run the sitemap generator and look at your titles, you’ll see a ton of duplicates – these NEED to be fixed. As for removing the multiple indexes, SID’s, etc – you’ll need to create a filter within my tool. I also see missing meta description tags, which you also NEED. I’m not trying to sound negative, but if you want those internal pages to do well, you’ll need to address these issues.

    What I would do. Look up a term that you know would be hosted on phpBB, then look at their setup, titles, description tags and see how they are configured. Perhaps they’ll tell you what tweaks they made to get it that way?

    By the way, I see the term artisan shaving soap seems to be hot, not sure if you knew that, but I’d write an article on it if I were you, I’m positive it would bring visitors :)

    Best,

    Jim
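Outside the tool, the effect of filtering out session IDs can be illustrated by stripping the parameter before comparing URLs. A minimal sketch, assuming the session ID travels in a query parameter named `sid` (an assumption of this example; check the forum's real URLs) and using hypothetical helper names:

```python
from typing import Iterable, List
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_session_id(url: str, param: str = "sid") -> str:
    """Return the URL with the session-ID query parameter removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() != param
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def unique_pages(urls: Iterable[str]) -> List[str]:
    """Collapse URLs that differ only by session ID, keeping first-seen order."""
    seen = set()
    result = []
    for url in urls:
        cleaned = strip_session_id(url)
        if cleaned not in seen:
            seen.add(cleaned)
            result.append(cleaned)
    return result
```

Run over a crawl that contains many `index.php` entries with different session IDs, `unique_pages` leaves a single `index.php`.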

  78. Matt says:

    Okay, I’m sorry, from what I read it’s not your program’s fault, it’s that the server should be configured so it won’t serve SID to crawlers.

    regards,
    Matt

  79. Matt says:

    Hello, I’ve got the same problem as Sergey. SID is placed everywhere, when I want to build a sitemap for a forum at artisanshaving org. Any hints on that?

    I had 59 instances of index.php with different SIDs in the sitemap. That’s a total mess, are there any solutions?

    regards,
    Matt

  80. Wes says:

    Jim, was your response dated January 5, 2012 at 7:45 pm a response to me? Thanks!

  81. Albee says:

    Hi Jim,
    Thanks for writing back. I don’t know what to do… Most of my pages are just basic html, published by manual upload. I don’t have any fancy shmancy programming.
    What do I do? My site has been around for over 10 years and hasn’t been getting decent ranking.

    I did put the sitemap up and Google accepted 400 of the 900 pages in the sitemap.
    Thx

  82. Jim says:

    Got it Jesse,

    If you open your project file, you’ll notice the difference is all URLs that don’t have a title. Also, you have a lot of duplicates, so you’ll want to fix those or it will hurt your ranking!

  83. Jim says:

    Hi Gin,

    I need a URL to look at.

  84. Jim says:

    Hi Albee,

    Those extra slashes you see are a bug in your programming and could result in poor ranking, as the search engine spider could enter an endless loop and/or index duplicate content. This should be your #1 priority!

    If you published the pages and then ran my sitemap tool (make sure you start a new project), and your pages are still not showing up, then you didn’t publish them. If they are there and linked to from your website, my tool will see them.
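
To see how many crawled entries are really the same page reached through repeated slashes, the exported URL list can be normalized and grouped. A minimal sketch, assuming a plain list of URLs; the function names and the example.com addresses are hypothetical:

```python
import re
from urllib.parse import urlsplit, urlunsplit

def collapse_slashes(url):
    """Collapse runs of '/' in the path part of a URL into a single '/'."""
    parts = urlsplit(url)
    clean_path = re.sub(r"/{2,}", "/", parts.path)
    return urlunsplit((parts.scheme, parts.netloc, clean_path, parts.query, parts.fragment))

def group_slash_variants(urls):
    """Map each normalized URL to the crawled variants that collapse to it."""
    groups = {}
    for url in urls:
        groups.setdefault(collapse_slashes(url), []).append(url)
    # Keep only the URLs that were reached through more than one spelling.
    return {clean: found for clean, found in groups.items() if len(found) > 1}

crawled = [
    "http://example.com/a//////file1.php",
    "http://example.com/a///file1.php",
    "http://example.com/a/file1.php",
    "http://example.com/b/file2.php",
]
variants = group_slash_variants(crawled)
print(variants)
```

This only helps measure the problem; the real fix is in the site's link-building code, so the extra slashes are never emitted in the first place.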

  85. Jim says:

    Hi Wes,

    Yes, simply create a new project (click project, new). You’ll want to copy any filters you have set up as they are wiped in a new project.

  86. Jim says:

    Hi Cliff,

    My sitemap generator / webmaster tool is 100% free – there is no limit on the number of pages your website can have (limiting that is just wrong). In fact, every tool I have on this site is free of charge, so enjoy. Now, I’m not beyond pointing out that if you find it useful, you could buy me a cup of chai ;)

  87. albee says:

    Hi there. Thanks for this tool.
    I published a few new pages a week ago and they’re not showing up at all in the sitemap. How long does it take for them to appear when I run this utility?

    Also, I have tons of entries in the sitemap that I have to clean up manually, such as ..//////////file1.php and the same ones with fewer slashes and the same file name. Is there a way to avoid those?

  88. Wes says:

    Hi Jim,

    I have a question for you. I find your tool very useful, but it seems to use server-side caching and so I can’t make changes to a website and immediately test those changes. Is there some way for the user to clear this cache? Thank you for the tool.

  89. Cliff Imasuen says:

    Please I would like to add a sitemap to my site but the sites I have seen have limited pages to be indexed. Can your tool build an xml file for all my pages without charges? I will appreciate a quick response. Thank you

  90. Jim says:

    Hi Marshall,

    I can’t do a thing to help you without information, such as a screenshot, settings you used and any filters – once I have these, I can then help you create an xml sitemap.

  91. marshall says:

    Why are there many records in the results with a different domain than the one I specified? I checked the “load from directory” option.
    Do I have to specify something else?
    Thank U
    M.

  92. Jim says:

    What version of Java are you running Dan? Google “Verify Java Version” to find out and make sure it’s the latest before running the web tool.

  93. Dan says:

    I seem to have a problem opening up the sitemap generator. I get the links across the top and a red X in the upper left corner.
    After refreshing the screen numerous times, the Java link starts and opens. Upon returning to the site, it now does not want to open up.
    Any help?

    Thanks

  94. Gin says:

    Hi. It looks like your sitemap generator ignores robots.txt “Disallow” rules, and your “exclude images” option is not working.
    For me, it collected the whole website, including images and everything that was disallowed.

    Regards.

  95. maemae says:

    Never mind! I figured it out. My landing page has only one main link to the rest of the site (the others were images and a css file) and somehow (not sure why) my html editor has added the full url (http etc) to it, so I guess your program thought it was an OBL. Thanks!

  96. maemae says:

    Hi! I used your tool as you showed in the video and everything was working great. I fixed a broken link and went to restart and now for some reason I am only being shown the 4 links on the index page? I have tried using a backslash after the site name, refreshing the page and restarting my browser but with no success? What am I doing wrong?

  97. ade says:

    Hi Jim,

    Thanks for the quick reply. I did a sanity check before sending the project file – seems my sanity needs a little work :) Sorry to bother you and keep up the good work!

    Best Regards,

    Ade

  98. Ade says:

    Hi Jim,

    Thanks for providing this great tool, I’ve been using it for a couple of years now without a hitch. However, I have a little problem now where it doesn’t seem to pick up absolute links within our site – probably something obvious, but I can’t see how to fix this.

    Thanks

    Ade

  99. Jim says:

    Hi Zotic,

    Thanks for the link to the project file.

    If you load that project file, then click on the sitemap tab, sort by encoding, you’ll see that at some point, your server dished out different output than the other pages, such as no titles and zero content length.

    My XML sitemap program takes this into account when you export your XML sitemap and excludes these files. I’m not sure why your server did this; perhaps add a delay and see if it happens again and with the same files. If so, there will be a pattern and we can go from there. Something’s up, and if the generator is seeing it, then so are the bots.

    I’ve included an image above so that you can see what I’m referring to…

  100. Zotic says:

    Hi Jim,

    First of all, this is a great tool! Now, I have the same problem as Jesse.

    “In the tool, my sitemap shows 15,000 URLs, but when I export as a sitemap I only get 7973 of them. If I export in any other manner (txt, csv, etc.) I get them all, but sitemap does not export them all.”

    Here is my saved project [snip]

    Please help me with this.

    Best regards!
    ———
    zotic

  101. Jim says:

    Hi Jesse,

    Send me your saved project file and I’ll take a look.

  102. Jim says:

    Hi Sergey,

    A section of your robots.txt file below:

    Disallow: *&feed=RSS_0.91*
    Disallow: *&feed=RSS_1.0*
    Disallow: *&feed=ATOM*
    Disallow: *&feed=JAVASCRIPT*
    Disallow: /about/partners/?area2=
    Disallow: /photos/?act=form
    Disallow: /photos/?author=
    Disallow: /photos/?year=2010
    Disallow: /photos/?year=2011
    Disallow: /articles/?author=
    Disallow: /articles/?year=2010
    Disallow: /articles/?year=2011
    Disallow: */?from=0
    Disallow: *&start=0

    Try analyzing your robots.txt file first :) See, already, my sitemap generator has helped you :)

    The sitemap generator respects the robots.txt file just fine. There is only one session id in your email below, and it’s the session id your site assigns to any visitor. Filters will help you here.

    I don’t have time to spend reviewing the entire site, but I can tell you it’s nothing to do with the generator. Knowing that, you can focus in on the real problem and perhaps see a nice increase in your ranking!

    Best regards,

    Jim
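
To illustrate what a session-id filter achieves, the `sid` parameter seen in Sergey's URLs can be stripped from a URL list so that each page appears once. This is a minimal sketch that runs on an exported list, not the generator's own filter mechanism; the parameter name `sid` is taken from the URLs quoted in this thread, and the function names are hypothetical:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def strip_session_id(url, param="sid"):
    """Return the URL with the session-id query parameter removed."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != param]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment))

def dedupe_without_sid(urls, param="sid"):
    """Strip the session id from every URL and keep the first copy of each result."""
    seen, unique = set(), []
    for url in urls:
        clean = strip_session_id(url, param)
        if clean not in seen:
            seen.add(clean)
            unique.append(clean)
    return unique

crawled = [
    "/forum/viewforum.php?f=4&sid=86971d4f3fb55f4d1b309760c5d39f80",
    "/forum/viewforum.php?f=4&sid=0123456789abcdef0123456789abcdef",
    "/forum/index.php?sid=86971d4f3fb55f4d1b309760c5d39f80",
]
print(dedupe_without_sid(crawled))
```

Cleaning the list afterwards treats the symptom only. As Matt notes elsewhere in the thread, the forum itself can usually be configured not to hand session ids to crawlers.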

  103. Sergey says:

    Hi Jim!

    Thanks a lot for your site Audit My PC!
    When I was searching for a sitemap generation tool, I was glad to find and test it.

    Unfortunately I found two problems:
    1). Rule “Respect robots.txt file” doesn’t work.
    All rules are ignored.
    But if I write them into Exclude content type filters that works.
    My robots.txt & Exclude content type filters are in attachment.

    2). Even worse, this system picks up a SID (session ID) from nowhere.
    I respect your WARNING: This webtool respects sessions so make sure you log out
    of your website BEFORE you run the sitemap generator!

    If you check my site adrionik. ru with depth level = 2,
    you’ll see in sitemap strings like
    /forum/viewforum.php?f=4&sid=86971d4f3fb55f4d1b309760c5d39f80
    /forum/viewtopic.php?f=38&p=3838&sid=86971d4f3fb55f4d1b309760c5d39f80

    Moreover, with each new run of the sitemap generator, the lines with “&sid=”
    have a different SID value!
    So I gave up waiting after 2 hours of generation, when the number of rows
    in the map exceeded 10,000.

    Maybe it’s my fault, but I can’t solve it myself.
    So I am trying to find another tool for sitemap generation and analysis.

    I would appreciate an answer.

    WBR,
    Sergey

  104. Jesse says:

    @Jim – the website is homes.anglerealestate com, and I am using the sitemap page as a start point at /idx/9787/sitemap.php – again, just so you don’t have to search for my issue: “In the tool, my sitemap shows nearly 20,000 URLs, but when I export as a sitemap I only get 4650 of them. If I export in any other manner (txt, csv, etc.) I get them all, but sitemap does not export them all.”

  105. sareeoutlet says:

    Hi all, I think this is the best tool for XML sitemaps. I am not a programmer, but I made a sitemap for my website very easily. Thank you.

  106. Dave says:

    Excellent tool guys. One recommendation….make the link to the tool more obvious…I spent ages wondering how to launch it, I thought the link to it was an advert and ignored it!

  107. Andriy says:

    Hello, Jim.

    Thanks for such a nice tool! I’ve been using it a lot for my sites.
    I have a problem with some of them, though. The XML Sitemap Tool fails to detect the UTF-8 character encoding and shows the titles as strange characters rather than Chinese characters. The charset is set correctly and Firefox detects it as “text/html; charset=UTF-8”, while running the URL check in your tool gives “text/html”. Is there any reason for such behavior? What can I do to fix it?
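
The difference between the two Content-Type values Andriy quotes can be shown in a few lines. A minimal sketch using only the Python standard library; the fallback to ISO-8859-1 when no charset is declared is an assumption made for illustration, not the sitemap generator's documented behavior, and the function names are hypothetical:

```python
from email.message import Message

def charset_from_content_type(header_value):
    """Return the charset declared in a Content-Type header, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset()

def decode_body(body_bytes, header_value, default="iso-8859-1"):
    """Decode a response body using the header charset, else an assumed default."""
    charset = charset_from_content_type(header_value) or default
    return body_bytes.decode(charset)

title = "中文标题"
raw = title.encode("utf-8")

print(charset_from_content_type("text/html; charset=UTF-8"))
print(charset_from_content_type("text/html"))
print(decode_body(raw, "text/html; charset=UTF-8") == title)
print(decode_body(raw, "text/html") == title)
```

If the server only sends `text/html`, a client that does not also read the page's meta charset declaration has to guess, which is one plausible way garbled titles can arise. Sending the charset in the HTTP header removes the guesswork.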

  108. adrian says:

    I am running ASP.NET. I entered my homepage URL, but the tool only finds the image folder (without listing all the images in it) and my main page itself. I used the copy-and-paste method from my home page, although I know mine is not a redirect page, but nothing works.

  109. Jim says:

    Hi Tish,

    I just checked it out myself, and Google Chrome will issue a message if your Java plugin is out of date. Here is a screenshot of what I received; I fixed it by simply downloading the latest version of Java.

    Update to the latest version of Java and you’ll be all set, or use Firefox.

  110. Tish Garcia says:

    In Google Chrome:
    I see these links at the top: AuditMyPC.com | Sitemap Generator page has instructions. Protect your privacy with Anonymous Surfing! If this helps you, buy me a cup of Chai Tea.

    Then the message in the middle:
    Missing Plug-in

    Screen is blank otherwise. In FF the screen is completely blank with the exception of the top links. No error message, but no content either. It just occurred to me: do you think this is a blocked pop-up issue?

  111. Jim says:

    Hi Tish,

    Can you provide me the exact message you are receiving, thanks!

  112. Tish Garcia says:

    I am trying to use the Sitemap Generator tool, but got a message saying I am missing a plugin for it. Can you tell me what the plugin is, or better yet send me a link to download the plugin I need? Thanks!

  113. nicki says:

    Hi, I’m running a website promoting Amazon’s products. I haven’t got a clue how to add sitemaps or whether they add them for me. On my site all I have is “update xml site map”. In the web tools it says “add site map”, but I do not have a clue how to find my sitemap. I am new to this and have never done it before. I would be grateful for any info, thanks.

  114. Jim says:

    Not all files are placed in a sitemap, such as css files, icons, etc. I need a website address, your exclude files and details in order to help you out Jesse.

  115. Jim says:

    Hi Alpesh,

    I don’t support other sitemap generators, and a WordPress XML generator plugin is limited in what it can really see, as it reads from the database only. My suggestion would be to contact the company that made the app you’re using and get support from them.

  116. Jim says:

    Thank you very much DNS, nice to hear that :)

    I’ve improved this tool since then and added some cool features, including the ability to search your site for hidden links out, such as in comment spam and more… Just need to make time to upload it to the site, so stay tuned!

  117. Leslie Geary says:

    Why does the generator list modify date for pdfs and images but not html files?

  118. dns says:

    Dude I am new to SEO stuff since I just began using PHP myself, but I must say this Java app you made is KICK ASS

  119. Alpesh says:

    Hi, I have a problem with my site: when I publish a new post, I only see my home page URL indexed for that post’s keyword search, and not the post page URL. My site is new. I have been searching for a solution for many days but haven’t found an exact one. If you can suggest a solution, I will appreciate it. My site is on WordPress and uses an XML generator plugin.

    Regards, Alpesh

  120. Jesse says:

    In the tool, my sitemap shows nearly 20,000 URLs, but when I export as a sitemap I only get 4650 of them. If I export in any other manner (txt, csv, etc.) I get them all, but sitemap does not export them all.

  121. Linda Greene says:

    Hi-

    Can you please tell us whether we have to choose the hierarchy levels ourselves, or whether your tool can do it itself? Also, we want to exclude some pages of the site from indexing; is there any way we can do it by remote.txt?

    Btw, our website is dialavacationrental! We are a vacation rentals directory, so we need to be pointed in the right direction on sitemaps.

    Thanks in advance!

    —
    Regards

  122. Robert (Popeye) says:

    Thank you sir for this great tool!

    I have been struggling to find out why my dynamically generated website is not getting any Google ranking, and thanks to your sitemap generator I can see exactly why (it won’t ‘spider’) and test different approaches to getting it corrected.

    Thanks again, just an unbelievably cool resource.

    Regards,
    Robert

  123. Marc says:

    Hi Jim, actually reread your response to Bo, and realized that I hadn’t included the “home.php” in my search. Thanks for a great tool.
    Marc

  124. Terry says:

    I can’t thank you enough for these great tools and tips. My simple Arcteryx Backpack niche website has definitely shown an improvement in ranking since using your tools.

    Regards,

    Terry

  125. aira says:

    Oh well, I really have problems with sitemap because in the first place, I don’t know how to do it.

    I’ve got hundreds of niche websites and am making a killing with adsense at it! For example, my site iamdavie is a dot com that targets credit cards and also speaks the text on the page to the visitor to suck them in. Huge CTR on this, but the thing is, I’m having trouble making an xml sitemap. Where do I start?

  126. Marc says:

    Hi. I can’t get your generator to give me anything beyond the home page casala org, any ideas? Thanks

  127. Jim says:

    Jeff, what are the filters you are using?

    In my instructions I explain how to create a sitemap, with the most important step being to let it run on your site for about 5 minutes, then stop it and review the URLs it found.

    When you do this, you see patterns emerge which you can use to exclude unwanted pages (such as css files, duplicate content, search form results, etc).

    This is also a VERY IMPORTANT part of the SEO process as well as detecting any xml sitemap errors that would slow your site down for search engines. In your case, you have invalid files being indexed, php code exposed in your urls and more. In the image to your right you’ll see a screen shot of what I found within 2 minutes (click on the image to see the full shot).

    Once you have corrected those errors (which, by the way, give your competition the advantage) and compiled a list of unnecessary files, you’ll find the process goes pretty fast. Plus, you’ll know that your site is error free! PS – not an error, but something that jumped out at me, is the privacy page.

  128. Jeff Kellner says:

    Jim

    4dmv .com, it stops at about 17k pages.

  129. Jim says:

    What is the site you’re trying to build an xml sitemap for?

  130. Jeff Kellner says:

    Jim

    Can you please help me? I have a site with 15k pages, and about 20 minutes into making the map it starts to slow down really badly and never really stops.

    thank you
    Jeff

  131. Jim says:

    Glad to hear it’s working Bo!

    I simply help those using the sitemap generator get up and running and don’t have time now for SEO work.

    Good luck,

    Jim

  132. Bo says:

    Thank you Jim, It’s working now.
    Jim will you be available if I need SEO assistance? Send me an email please.

    Regards
    Bo

  133. Jim says:

    Visit your website’s main page using your browser. Once you are on that page, copy the full website address shown in the address bar and paste that into the sitemap generator as the website address that you want to create a sitemap for. I did this and it worked fine.

  134. Bo says:

    Thank you for your detailed explanation, Jim. I corrected the problem you indicated, but now I get nothing! It only shows 1 processed.

    Actually, I get the same 301 page. By the way, thank you for your compliment.

    Regards
    Bo

  135. Jim says:

    Got it, and I see your problem(s)…

    Take a look at this image, it’s the URL Check tool inside the sitemap generator that is overlooked by most but can help you improve your ranking by showing you what the search engines see.

    Hire-Safe.com error

    You’ll notice that when search engines visit hire-safe they are told that hire-safe doesn’t exist and are redirected to hiresafe (no problem there), so they visit hiresafe, but now they are told that it doesn’t exist and are sent to www.hiresafe. com/background-check.aspx. However, when a search engine bot ends up on that page (that’s what you are telling the bots, by the way), they find your logo (sweet looking – kudos on that; in fact, the whole site looks great!), and the logo takes them to a different main page, which is now called /background-check.aspx.

    This confusion and duplicate content is guaranteed to hurt your ranking and give your competition the advantage, especially in such a competitive market!

    The sitemap stopped because you entered a redirect for the start page of the website and it needs a correct start page. So, if you were to continue on like it is (I don’t recommend that!), then you would copy and paste the url that shows up when you visit your website.

    Good luck!
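
The redirect chain described above can be traced hop by hop to find the address that finally answers with a page, which is the one to paste into the generator as the start page. A minimal sketch, assuming a `fetch` callable that returns a status code and redirect target for a URL; the example.* addresses and the status codes in the fake site are hypothetical stand-ins, not the actual responses from the site discussed:

```python
def trace_redirects(start_url, fetch, max_hops=10):
    """Follow redirects from start_url and return the list of (url, status) hops.

    `fetch` is a callable returning (status_code, location_or_None) for a URL.
    """
    hops, url, seen = [], start_url, set()
    while url is not None and len(hops) < max_hops:
        if url in seen:
            raise ValueError(f"redirect loop at {url}")
        seen.add(url)
        status, location = fetch(url)
        hops.append((url, status))
        # Only redirect statuses continue the chain; anything else ends it.
        url = location if status in (301, 302, 303, 307, 308) else None
    return hops

# Hypothetical responses modelled on a two-step redirect chain.
FAKE_SITE = {
    "http://old.example/": (301, "http://new.example/"),
    "http://new.example/": (301, "http://www.new.example/background-check.aspx"),
    "http://www.new.example/background-check.aspx": (200, None),
}
hops = trace_redirects("http://old.example/", FAKE_SITE.__getitem__)
for url, status in hops:
    print(status, url)
```

Passing the fetcher in as a parameter keeps the sketch runnable offline; a real check would swap in a function that issues an HTTP request without following redirects automatically.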

  136. Bo says:

    Old domain is: hire-safe. com and the new domain is www.hiresafe .com

    Thanks for being so prompt

  137. Jim says:

    Hi Bo,

    I need the website address to be able to help you.

  138. Bo says:

    Hi Jim,
    I recently made a 301 redirect to a new domain. I tried to use your tool to generate the sitemap of the new domain, but it only created the home page reading it as 301 page! I have over 700 pages in the new domain!

    Please advise.

    Thank you very much for this great tool, hopefully I can make use of it.

    Regards
    Bo

  139. Alex says:

    Hi Jim.

    Thanks again for your great app.

    Here are some suggestions to make your sitemap generator even better.

    1. If a page has a canonical link that differs from the page URL, maybe it’s a good idea to skip the non-canonical links and include only canonical links in the sitemap.
    Google and some other search engines support canonical links, and more and more websites provide information about canonical links.
    2. Is it possible to include page titles in the sitemap?

  140. Jim says:

    Hi Sid,

    Easy fix, and will likely help your rankings as well (Pays to ask right : ) – If I disable javascript in my browser, your site shows no navigation at all; my sitemap generator does not follow javascript links.

    Google will follow javascript links, but other search engines may not so play it safe and put the navigation links on the site in html – you WILL be glad you did :)

    Best regards,

    Jim
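
What a crawler that ignores JavaScript sees can be approximated by collecting only plain `<a href>` links from the page source. A minimal sketch using Python's standard HTML parser; the two HTML snippets are hypothetical, and this is not the sitemap generator's actual parser:

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect href values from <a> tags, as a crawler without JavaScript would."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            # Skip empty hrefs and javascript: pseudo-links.
            if href and not href.lower().startswith("javascript:"):
                self.links.append(href)

def extract_links(html):
    collector = LinkCollector()
    collector.feed(html)
    return collector.links

# Navigation built by script: nothing for a non-JavaScript crawler to follow.
js_only = '<div onclick="location.href=\'/about.html\'">About</div><script>buildMenu();</script>'
# The same navigation as plain HTML links.
plain = '<ul><li><a href="/about.html">About</a></li><li><a href="/contact.html">Contact</a></li></ul>'

print(extract_links(js_only))
print(extract_links(plain))
```

Running a check like this against a site's home page is a quick way to confirm whether the navigation is reachable without scripting.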

  141. Jim says:

    The XML sitemap generator works fine, Tina. If you read the comments, you’ll see that you are either not running Java or your version is incorrect. I also have instructions in those comments on how to fix that.

  142. Tina says:

    Your sitemap generator page doesn’t work. Nothing but a string of anchor text comes across the top like a menu system but there is no program of any sort, just anchor text links. Please email me when you get it fixed.

  143. Sid says:

    Great info on site. I have a problem on Sitemap builder, I must be doing something wrong. The crawler will only crawl my index page all other pages are ignored, when doing a URL Check for other pages I get a 404 message. I have the latest Java installed for win7. My site is sidneyharbour.btinternet. co.uk any suggestion would be appreciated.
    Best regards,
    Sid.

  144. Benton says:

    Hello! Don’t you use Facebook? I’d like to follow you and the sitemap generator if that would be alright. I sure am undoubtedly taking pleasure in your blog site and expect new blogposts.

  145. Jeff Davis says:

    Attempting to reload the project gave an error. This was a walk of an internal SharePoint site where I am an admin so can’t share links, sorry.
    Message was “java.lang.NumberFormatException: Invalid character ” in base 64 string”
    I only have 11,000 links but didn’t want to take the 45 minutes for the sitemap generator to re-walk the site, wanted to analyze some of the failed links.
    Since I saved every export option possible I can go to one of the other formats and MANUALLY follow the links that failed, but wanted to use the tool itself.
    Despite this glitch, I think this version of the tool is EXTRAORDINARY and massively helpful.
    Will hoist a cup of Chai in your honor, buy you one if my boat is in port when you’re in town.
    Regards,

  146. Sartaj bedi says:

    Hi,
    I think you offer a great service, appreciate the video.

    I ran the tool to generate the sitemap and found some 404 errors. I copied them into the address bar and they work. Why does the sitemap generator show a 404 error?

    outoftheblue. in/“shree-ganeshya”-by-balaji-bhange.

    Also, the following URL has failed, but it never existed:

    outoftheblue. in/categories/menu/categories/menu/food/sizzler.html

    The correct format is
    outoftheblue. in/categories/menu/food/sizzler.html

    Please advise.

    Regards,
    Sartaj

  147. Jim says:

    Hi Gez,

    I ran the webtool on the site in your email and it respected your robots.txt file fine. I did notice that you don’t have the /component/mailto/ blocked from the robots.txt file, which you want to do – it’s showing up in the url list.

    The robots.txt is referenced because that is exactly what the sitemap generator does: it refers to that exact file.

  148. Gez says:

    Looks like a great tool, Jim, but I’m having a couple of issues…

    I have checked “Respect meta tag” but urls with still appear in the sitemap so this instruction is not being followed. (“Incidentally, why use meta name = “robots.txt” instead of the correct meta name = “robots”?”)

    Gez

  149. Rohit says:

    Hi to all.

    While I am searching the number of cached pages of my website in Google I am seeing a different type of URL start with * (Yes, star mark) like that .exampledomain. com but I am not getting why its showing and how I can remove it. I thought there would be some problem related with my XML site map that why asking here.
    If any one know the solution to remove that url, please let me know.

    Thanks,
    Rohit

  150. Wayne says:

    Hi Jim
    I have been using this tool for a while. It’s been great, but recently I have had problems with it. Lately only my mobile website is being indexed; it’s skipping all other files. There is a PHP script at the top of each page that looks at the client and, if it’s a mobile client, redirects the page to the mobile site. Could this script be the reason your tool will not index my website? It all used to play fine until recently.
    cheers.

  151. alex says:

    I think the problem was with java, I installed the last version and now it’s ok.
    What is the way to increase memory for java? I tried to do it through the control panel “-Xmx300m” but these settings are not saved there. When I open the settings panel again they are not there.

  152. Jim says:

    Thank you very much kind sir :)

  153. BTG says:

    This sitemap tool is excellent. It blows any other free tool I have found completely out of the water. Thank you!

  154. Jim says:

    Hi Alex,

    What version of Java are you running? Also, what OS, browser version and does this happen with other sites?
    Test java here: java.com/en/download/testjava.jsp

  155. alex says:

    When I try to export into xml sitemap I am getting the following error:
    cannot find file: filename=resources/client/reportGoogleSitemapPrefix.txt

    And another thing when I am trying to open saved project I am getting error
    java.lang.NullPointerException

    please advise

  156. Jim says:

    Hi David,

    Nothing has changed on my end, everything works fine, tested from multiple pc’s and browsers.

    Try a different computer and let me know.

  157. David Moore says:

    Hi,

    Great tools, but for the last few days I have not been able to use the Sitemap Generator – the link just goes round in circles.

    Something wrong somewhere.

    Best wishes,

    David

  158. Razibul Hasan says:

    Thanks. It works great. As for the optional frequency tag of the XML sitemap, I am not sure how it is handled by the application?

  159. Hervé says:

    No error.
    I have not tested with Google with this “very small” problem in place.
    I have written my own tool (PHP) to delete this lastmod, but maybe it is not necessary?

    Thanks

  160. Jim says:

    Hey Herve,

    I just now looked at the sitemap.xml file and see the lastmod field under references to files like mp3. If the server reports a lastmod field, it is included; if not, you won’t see it. Either way, it should be fine. Did you have a problem submitting your sitemap?
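
The behavior described here, where lastmod appears only for URLs whose server response carried a modification date, can be sketched as follows. In the sitemap protocol lastmod is an optional element, so a file mixing entries with and without it is still valid. The function name, URLs and date below are hypothetical; this is not the generator's own export code:

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(entries):
    """Build sitemap XML from (url, last_modified_or_None) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:  # only emitted when a modification date is known
            ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")

xml_text = build_sitemap([
    ("http://www.example.com/", None),
    ("http://www.example.com/song.mp3", "2011-07-25"),
])
print(xml_text)
```

Static files such as mp3s are typically served with a Last-Modified header, while dynamically generated pages often are not, which would explain why only some entries get the tag.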

  161. Josie says:

    Hi Jim,

    Please ignore my previous comment as I have now sorted the errors.

    Excellent tool to use, much appreciated.

    Josie

  162. Josie says:

    Hi Jim,

    Thanks for your reply. I’ve had a look at these links; they work fine and I still want to keep them on the site. I don’t understand why the server would not want the pages to be accessed. Is there a way to solve the problem without removing the links, please?

    Thanks again,

    Josie

  163. Jim says:

    Herve, I’ve been on vacation but took the time to generate the sitemap, just have not looked at it yet. Will try to do that today or tomorrow.

  164. Jim says:

    Hi Josie,

    A 403 comes from a link on your website that points to another page or media file on your website that your server doesn’t allow visitors to access. In the XML sitemap tutorial at the top of this page, I show how to click the + key next to the 404 / 403 error to find out what page the link is on. Once you have that page, you can search for and remove the link if you like.

    It’s an error that would come up regardless of who followed the link, so you’ll want to fix this – the sitemap generator does not create such errors, only reports them.

  165. Jim says:

    Hi Ray, should be no problem at all – I have not yet found an OS that the sitemap generator doesn’t work with :)

  166. Ray says:

    I have a Mac and wanted to know if I can use this for Lion OS? I really don’t want to use bootcamp and run my mac as a PC unless I really really have to ??

  167. Josie says:

    Hi Jim,

    When I run the tool I get the following error messages for some of my urls: ‘403 forbidden’.
    I don’t really understand what this means and am not sure how to fix it?

    Any help would be great.

    Thanks,

    Josie

  168. Hervé says:

    Hello Jim
    Here are the exclude URL filters:
    *search*
    *frm=*
    *image.php*
    *about*
    */template/*
    */img/*
    */logo/*
    *mosaic.jpg
    */thumbnail/*
    */puce/*
    *password.php
    *slideshow*
    *popuphelp*
    */cartes/*
    */mini/*
    */images/*
    *.gif
    *.ico
    *.png
    *.css
    *.txt
    *.kmz
    *start=0*
    *start=30*
    *start=40*
    *start=50*
    *.xml
    *sitemap*
    *plan-du-site*
    *video2*
    *?pg=*
    */img-dressage-chien.php*
    *register.php*
    *identification.php*
    *feed.php*
    complex:start.*start

    Nothing else.
    Thank you
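
To preview which URLs a wildcard list like the one above would drop, the filters can be translated into regular expressions and tested against sample addresses. A minimal sketch under two assumptions not confirmed in this thread: that `*` is the only wildcard and that a filter must match the whole URL. The function names and example.com URLs are hypothetical:

```python
import re

def wildcard_to_regex(pattern):
    """Translate a filter where '*' means 'any characters' into a compiled regex."""
    # Everything except '*' is escaped, so '?' and '.' are matched literally.
    pieces = (re.escape(piece) for piece in pattern.split("*"))
    return re.compile(".*".join(pieces))

def is_excluded(url, filters):
    """Return True if any filter matches the entire URL."""
    return any(wildcard_to_regex(f).fullmatch(url) for f in filters)

filters = ["*search*", "*.gif", "*?pg=*", "*/img/*"]

for url in [
    "http://example.com/search.php?q=dog",
    "http://example.com/logo.gif",
    "http://example.com/list.php?pg=2",
    "http://example.com/dressage.php",
]:
    print(is_excluded(url, filters), url)
```

Python's `fnmatch` module was avoided on purpose: it treats `?` as a single-character wildcard, which would change the meaning of a filter such as `*?pg=*`.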

  169. Jim says:

    Hey Herve,

    What are the filters you used when you created the sitemap that had only xml tags?

  170. Jim says:

    Hey Andrew, what is the site’s address that you were working on? You have a bug, not the generator, and I can tell you how to fix this!

  171. Andrew says:

    How do you get your sitemap generator to stop listing the same pages over and over again? When i created the html version the file was huge and could see that the generator had listed the same page lots of times.

    Andrew

  172. Hervé says:

    Hello Jim

    The website with the .mp3 is educador.fr

    Thanks again

  173. Jim says:

    Hello Herve,

    Jim does this for free, his time is limited, and he does the best he can :)

    And I’ll look into this. What is the website again?

    Thanks!

  174. Hervé says:

    Hi
    I’m going to end up believing that Jim doesn’t want to answer my questions…

    And my question about .mp3 ?

    :o)

  175. Jim says:

    Hey Martin, look under common problems on the sitemap page where the link is, it’s the first problem and solution you want :)

  176. Jim says:

    Hi Annie,

    I need a website address in order to help you.

  177. Martin. says:

    Hi, the link doesn’t seem to work for me and I’m trying to build a sitemap for my website Accomx.

  178. Hillary Anne says:

    just wondering why you never answered my question that I posted back in June and you have removed the most recent one. I was trying to get some assistance for my second website.

    thanks anyway.

  179. Hervé says:

    Hello Jim

    I posted a question last July (25) and I have solved my problem.

    question :
    “I would like to exclude URLs that have “start=” twice in the parameters
    -example.php?john=1&start=10&doe=doe&start=20”

    answer : “complex:start.*start”

    I have another problem with the generator :

    On one website, I have links to many “*.mp3” files (files I created myself!).
    When I generate the sitemap.xml, for these files (and only these files) the XML tags “lastmod” “/lastmod” are in the sitemap… Is it a bug?

    Thanks
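
The `complex:start.*start` answer quoted above can be checked outside the tool. A minimal sketch, assuming the text after `complex:` is treated as an unanchored regular expression (the thread suggests this but does not state it); the second URL is a hypothetical counter-example:

```python
import re

# Assumption: the part after "complex:" behaves like an unanchored regex search.
double_start = re.compile(r"start.*start")

urls = [
    "example.php?john=1&start=10&doe=doe&start=20",
    "example.php?john=1&start=10&doe=doe",
]
excluded = [u for u in urls if double_start.search(u)]
print(excluded)

# A tighter variant that requires the parameter form "start=" both times.
double_start_param = re.compile(r"start=.*start=")
print(bool(double_start.search("restart.php?start=1")))
print(bool(double_start_param.search("restart.php?start=1")))
```

The looser pattern also catches any URL that merely contains the word "start" twice, such as a file named restart.php with a single start parameter, so the tighter form may be the safer choice on sites where that can happen.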

  180. Jim says:

    Hi Julia,

    The sitemap generator works perfectly with WordPress, and in fact, it can discover problems you can’t uncover with a WordPress sitemap plugin. It works with anything, but there are a few very old and poorly programmed content management systems out there that don’t verify actions. What does this mean? Say that you are logged into a site’s admin section, and in that admin section there is a link to delete a page that when clicked would delete the page without asking you if you’re sure; well, if you were to run the sitemap generator on that site while logged in using the same browser, and if there were links to the admin section, and should the spider follow those links, the page may end up being deleted. This is extremely rare and the possibility of you using such an antiquated CMS is unlikely, but I mention it just to play it safe.

    The real concern for logging out of the admin section before running the sitemap generator is that it could follow any links to your admin section and include those in the sitemap, which you don’t need.

    So, simply log out of any admin section of your site before you run the sitemap generator.

  181. Annie says:

    Hi Jim, just wondering why your sitemap generator only found images. I have a video on my index page…could this be the reason? and also it messed up all the images in my site.

  182. Julia says:

    Hi,
    This looks like a great tool and I want to use it, but I’d like some clarification on the admin log-in possibility of deleting links. Does this apply to Dreamweaver and WordPress? Does the generator actually delete links or pages? Surely not.
    I’ve been testing sitemap generators for a while and I want to write a post on the best one, so if you could answer this dumb question, I can get started!

  183. Riki says:

    Hello. I have a phpBB3 board. I would like to use your sitemap generator, but I need to exclude pages like login, user control panel, admin panel, images, etc. Could you write me the filters I need to exclude them, please? Thank you.

  184. Jim says:

    No problem Jen, glad you solved the problem (I love it when that happens : )

    Enjoy the day!

  185. Jen says:

    Nevermind! The project wasn’t loading because I was accidentally trying to open an exported session file instead of the project XML. Next, I read further up on this page where you use URL filters and not content filters to get rid of CSS files, so I’ll do that instead. (Same with my MPEGs.)

    PNGs and GIFs still show up in my sitemap when I choose to “Exclude Images,” but I can work around it by also adding their file extensions to the URL filters.

    Thanks so much for your quick response, Jim!

  186. Jim says:

    Hey Jen,

    I need the website and examples of the mimetype filters you’re using to generate the xml sitemap.

  187. Jen says:

    Awesome tool, but I’m encountering two problems. When I save my project and try later to open it, I get the error “java.lang.NumberFormatException: Invalid character : in base 64 string.” Huh?

    The second problem is that although I’m entering mimetypes in the “Exclude content type filters” box, one per line, the types are all still showing up in my sitemap, as are some image types when I’ve excluded images.

    The first problem is bigger than the second, as I’m faced with re-entering all of my update frequencies and priorities if I can’t reload the project. Yikes!

  188. Jim says:

    Hello Herve,

    Thanks for the kudos. If you can do it with a regular expression (complex:), then yes, you can do it with the sitemap generator; however, I don’t know what that expression would be. If I get time, I’ll look into it, but that won’t be for a while.

    If you figure out the expression, will you please post back here for all to see?

    Thanks and have a great day!

    Jim

  189. Jim says:

    When I go to your main page, it redirects me to /default.html

    When I click “Skip Navigation” it takes me to index1.html, which gave me an error the first few times I tried to load it (you need to fix this).

    On index1.html you have the header link to /index.html which then redirects to default.php which then redirects to yet another page.

    Solution: you need to enter your website/index1.html as the site address.

    This is the extent of my help, however, you have some issues that are going to impact the way your site ranks in the search engines, so I’d get that initial navigation straightened out!

  190. vinoo says:

    Hi,

    After I enter my site on auditmypc xml sitemap tool (malabarhouse,com), when I click on the Start button, it never seems to go beyond 1 page. Is there any reason this is happening? Any help would be appreciated.

    vinoo

  191. Hervé says:

    Hello Jim

    Are my questions too easy?
    My English is poor, sorry

  192. Jim says:

    Great news Keith – glad you found the problem and all is well – nice site by the way :)

    Best regards,

    Jim

  193. Keith says:

    Jim,

    Fixed my problem. The way I had my code written, it was showing more parts on pages to the crawler than it should have been. I changed the code; now the sitemap generator reads it correctly and works fine. Thanks for the help.

  194. Keith says:

    Jim,

    Thanks for the quick response. I should have added this in the previous post:

    My filters to exclude pages are as follows:

    admin
    DESC
    ASC
    login
    logout
    rfqadd
    sitemap

  195. Jim says:

    Hey Keith,

    Notice as the sitemap generator crawls your site it finds a large number of urls that should not be included in the sitemap, like this:

    /rfqadd.asp?pn=MOT00025&url=/sale/motors.asp?
    /rfqadd.asp?pn=MOT00026&url=/sale/motors.asp?
    /rfqadd.asp?pn=MOT00027&url=/sale/motors.asp?
    and many, many more….

    These are pages that when visited by search engines, actually add products to a shopping cart / quote. Here is the message:

    MOT00027 has been added to your quote.

    What would you like to do next?

    Click Here to view your Quote.
    Click Here to go back to your previous page.

    Eliminating these and other such non relevant pages will speed up the process and prevent your memory issues. Create as many filters as necessary and then save those filters for the next time you run the sitemap generator.

    By the way – I’ll bet if you have some type of confirmation before adding to the cart / quote, you’ll free up a good number of resources on your server :)

    It’s all about finding a pattern for the filter. Start generating the sitemap, let it run for a few minutes, stop it, find the pattern and make a filter, then create a new project and start it again; do this until all (or most) of the non-important urls are eliminated. Note: If you don’t create a new project and simply continue, any filter you add along the way will not have any effect – it needs to be a new project.

    Also – If you want to save yourself some frustration, make sure you copy and paste your filters to your clipboard (or wherever) BEFORE you start a new project as it will clear out all the fields.

    Best regards,

    Jim
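
    Editor's sketch: to preview what a list of plain-word exclude filters (like the one Keith posted in comment 194) would drop, the crawled URLs can be tested offline. This assumes a filter word excludes any URL that contains it as a case-sensitive substring; the generator's real matching rules are not described in this thread. The name should_skip and the URL /sale/motors.asp are made up for illustration.

```python
# Filter words from comment 194.
EXCLUDE_WORDS = ["admin", "DESC", "ASC", "login", "logout", "rfqadd", "sitemap"]


def should_skip(url: str, words=EXCLUDE_WORDS) -> bool:
    """Assumed rule: skip a URL when any filter word appears inside it."""
    return any(word in url for word in words)


crawled = [
    "/rfqadd.asp?pn=MOT00025&url=/sale/motors.asp?",  # from the comment above
    "/rfqadd.asp?pn=MOT00026&url=/sale/motors.asp?",  # from the comment above
    "/sale/motors.asp",                               # hypothetical content page
]

# URLs that would still reach the sitemap under the assumed rule.
remaining = [u for u in crawled if not should_skip(u)]
```

    Running a check like this on a few minutes' worth of crawled URLs is one way to find the pattern before starting a new project.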

  196. Keith says:

    I am trying to run my website conveyorandparts,com, which is about 70,000 pages, through your sitemap generator. However, this tool seems to bog down after a couple thousand. I get memory low errors. I tried it on my machine, which is a Win 7 Ultimate, Xeon processor, 4 gig machine. I also tried to up the memory for Java to 4 gig by typing in -Xmx4000m, but I still have the same issue.

    Any thoughts or ideas? I really like the tool, it is very easy to use and does a great job. I use it on smaller websites and it is awesome. That is why I would rather troubleshoot this issue than find another tool. Thanks in advance.

    ~Keith

  197. Hervé says:

    Hello, good job

    I have just a simple question

    I would like to exclude URLs that have “start=” twice in the parameters
    -example.php?john=1&start=10&doe=doe&start=20

    It is a complex expression, but is it possible?

    Also, for .mp3 files the lastmod tag always appears in the sitemap.

    Thanks from France

  198. Jim says:

    Okay Dima, here is the problem.

    Whenever you type in your website address and my sitemap generator reports it can’t find webpages that you know exist, the first thing you should do is click on the URL Check tab inside my sitemap generator application (it’s more than just a sitemap generator, it’s an awesome SEO tool and server header checker as well). Enter your website address in the URL field, select “to content view” and press start. I did this for your website and I’ve taken a screen shot which you can see below. What you’re looking at are your server’s header...

    [Screenshot: taxes for expats sitemap error]

    I circled the problem. Your server is returning a 301 for the main page whenever someone types in the website address – this is serious and you NEED TO FIX this right away. Your server is telling the search engine bots that the site has moved permanently to a new address, that address being in the location header field (HTTPS), see it?

    My sitemap generator sees, as do the bots, that the address you started with is invalid or has moved. If your site stays like this, you may have a serious impact on your rankings and it may take a long time to get back where you would like to be.

    I stopped reviewing your site at this point and didn’t look further for errors.

    Enjoy :)
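
    Editor's sketch: the same header check Jim describes with the URL Check tab can be approximated with Python's standard library. http.client does not follow redirects, so a 301 and its Location header stay visible. The function names and the example address are made up for illustration; this is not how the generator itself is implemented.

```python
import http.client
from typing import Optional, Tuple
from urllib.parse import urlsplit

REDIRECT_CODES = {301, 302, 303, 307, 308}


def describe_response(status: int, location: Optional[str]) -> str:
    """Summarise what a crawler sees when it requests the start address."""
    if status in REDIRECT_CODES:
        return f"{status} redirect to {location or '(no Location header)'}"
    if status == 200:
        return "200 OK: the address answers directly"
    return f"{status}: unexpected status for a start page"


def fetch_status(url: str, timeout: float = 10.0) -> Tuple[int, Optional[str]]:
    """Send one HEAD request and return (status, Location header)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        conn.request("HEAD", parts.path or "/")
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

    Usage would look like print(describe_response(*fetch_status("http://www.example.com/"))), with your own start address in place of the placeholder. If it reports a redirect, the Location value is the address to enter in the generator.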

  199. Dima says:

    Hi Jim,

    I’m working on taxesforexpats,com. It’s an HTML site with about 150 pages. I appreciate any help.
    Thank you for the quick response!

    Kind regards,
    Dima.

  200. Jim says:

    Hi Dima,

    What is the website you’re working on – I think I have an idea of what it might be and it’s a real simple solution :) Not what you want to hear, right? :)

  201. Dima says:

    Hi there,

    You have created a great tool and it did a great job for me. About a month ago I generated a sitemap and exported it as HTML (sitemap only). Now I need to update the list, and I used this tool exactly the same way as I did a month ago, but the export file contains only one entry. I have tried in FF5 and Chrome 11; the result is identical. No matter what rows I select, all other export types save the full list of items, but HTML (sitemap only) saves only 1 item. When I delete that item from the sitemap results screen and try again, it saves a new list with only one item again. I tried both browsers on my second laptop; the result is the same. Both computers are running Windows XP. Java is set to update automatically. The last update was 11.07.11.
    Maybe I’m doing something wrong; please help. I really wasted more than 2 hours playing with options, reading comments, trying everything. I’m a PHP developer and can’t predict what the possible issues might be, but I can clarify the situation; just tell me what additional technical details you need.
    Waiting for your response.

    Kind regards

  202. Jim says:

    I’ll look into the time lag and thanks for the mention on the row filters to filter out the 301’s!

    Have a great day Tim!

    Jim

  203. Tim says:

    Jim

    Yes I have been selecting them all, but to delete them took just too long (on one occasion I waited more than 2 hours and my machine is pretty good and I gave Java oodles of RAM for itself).

    Just before I gave up though, I used the Row filter….. and filtered out all the 301’s from there. Took no time at all then.

    Great app by the way, unlike most of the rubbish (paid as well as free) you get!

    Many thanks
    Tim

  204. Jim says:

    Hi Tim,

    Are you deleting the 301’s individually? If so, you can sort the list, then click and hold the first occurrence, scroll down to the last 301, then right click and then choose remove selected entries.

    Let me know if that helps :)

  205. Tim says:

    Hi

    I have my IIS change all my URL’s to lower case. Is there a way to exclude all 301 redirects as a result of this, as deleting all the 301’s before saving the sitemap takes ages, especially as the site is over 10,000 pages anyway.

    Many thanks

  206. Denis says:

    Hi, this is a great tool!

    I have a problem to solve. I need to create a sitemap only from the links on my pages that are wrapped in the bold HTML tag.

    Is it possible to grab only the bold links for the sitemap?

    Thank you,
    Regards

  207. Bo says:

    Re: Sitemap workaround for Javascript menu

    OK, I added a sitemap to the root directory, linked on the home page schurchwoodwork.com; scroll down to Sitemap and click.

    I ran another crawl and all the files and directories were listed in the sitemap.

    BUT, would you please verify that Googlebot likewise will be able to do the same – or must we install the sitemap on the bottom of the home page in the same color as the background.

    I must thank you again for such a wonderfully engineered application, a big shock to the system after so much garbage programming…

  208. Jim says:

    Hi Bo,

    Thanks for the Chai and yes, that would solve your problem – however, I would simply include them in your homepage, perhaps at the footer until you get everything straightened out. As for the sitemap generator, it does not spider javascript menus.

  209. Bo says:

    Jim:

    Thanks for the quick response…

    I just took over the maintenance duties of schurchwoodwork,com and was totally unaware of the menu / javascript problems with search engine bots, and, apparently your application XML Sitemap Generator – is that why your app can not crawl e.g. this directory schurchwoodwork,com/portfolio1/index.html ?

    This side of recreating a menu could I create a separate html file similar to this page schurchwoodwork,com/blind_urls.html to get around the problem – that would allow bots to access pages and directories currently linked through the javascript menu system? Or is there another preferred workaround?

    Thanks in advance for your advice and excellent app…

    PS: XML Sitemap Generator worked perfectly on another site that I created with standard menu system… Will be buying you a cup of chai soon!!!

  210. Jim says:

    Hi Bo,

    I see the problem right off – the first thing I did was view your website without javascript enabled, and you’ll see all your navigation disappears. (The same happens when I view it without styles, which I have not looked into further.)

    I see a number of pages in Google and bing for your site, but there are going to be a number of bots and other devices that won’t read javascript to extrapolate the menu.

    Your navigation should be visible with or without javascript. I’m confident you’ll receive a lot more traffic when this is fixed :)

    Best regards,

    Jim

  211. Bo says:

    Wonderful application. But, there are a number of html files and directories that are missed by Sitemap Generator Webmaster Tool. I disabled the robots.txt option, didn’t help. Hit the ‘retry failed’ button, didn’t help. How can I force Sitemap Generator Webmaster Tool to crawl missing files and directories.

    Here is the URL in question schurchwoodwork, com

    Many of the html files and directories are not being crawled.

    Thanks…

  212. Andy Boehm says:

    This tool is great! Amazing work. I’m having an issue however. My site map is generating urls with the slash character code (%2f). What would be causing this? We are using Ektron to create url alias to many of our pages. Could this cause the sitemap builder to read the character code instead of converting it to a slash?

    Many thanks!

  213. Carmen says:

    I ran the sitemap tool and created a site map, but realized that I made some errors, so I wanted to go back and do it over again. This time when I click the link to go to the sitemap tool, it won’t load – just gives me a blank gray page. Java IS enabled. Any ideas?
    Thanks.
    Carmen

  214. Alyson says:

    Hi,
    I found your generator last week on my quest to learn how to create a web site generator. I watched the video and read the remainder of the page, then I gave it a try. Today I wanted to study it again to learn the ins and outs and see if it is working, but the page is blank!
    What happened to the code?
    I am still learning about the web, but I read everything I can to advance my understanding.
    Thanks

  215. Jim says:

    John,

    What you’re asking is for me to take a portion of my time, run your sitemap generator on your website, find out why you’re having problems and report back, right?

    That’s a lot of extra time above and beyond what I’ve already done to help with the tool, instructions and video – such a task requires Chai to motivate me :)

    You can buy me a cup (see the link in the right horizontal menu bar on this page) and leave your website address there and I’ll take a look.

    Best regards,

    Jim

  216. John says:

    Hi Jim, this looks like a great tool – I watched the 14 min vid and am really excited about this product. The ability to sort just the web pages you want in your xml file is superb – now all I need to find out is why I am getting errors on pages that DO actually exist.
    I’m sure it’s just something I’m doing wrong – if you could advise, that would be great.
    Is it possible to send you the URL in question without making it public here?

  217. Ben says:

    Hi Jim

    Many thanks for the great tool!

    I was wondering how you calculate the column “Level” in the csv-file. The site I’ve analysed gives me for some pages Level = 2 where the html-sitemap shows me that the level should be 5.
    It would be great if you could help me here :)

    Best Regards
    Ben

  218. Jim says:

    Timothy,

    That’s not my sitemap generator randomly adding forward slashes, it’s a bug in your site and you need to fix it as it’s doing this with Google, Microsoft, Yahoo and other bots!

  219. michael says:

    Hi
    I ran the crawler and came up with no errors. BUT none of my pages have titles. How can I add them?
    Peace,
    michael

  220. wasif says:

    Thanks! This was fast, easy, and found one missed page…

  221. Elvis says:

    I am having hard time excluding certain parameters from the sitemap. The URL is like
    853-bla-bla-bla?orderby=price&orderway=asc.

    How to exclude the orderby and orderway from the sitemap?

  222. osvaldo says:

    If I try to add a stylesheet on xml sitemap of site bestholiday. fr the sitemap is broken.
    No .xsl file is generated with the sitemap.

  223. Jenny says:

    Hi Jim,

    Thank you for a fantastic tool :-) However, I have an issue which may be an issue with my cart but would appreciate your thoughts so I can be sure please and if you have any ideas what I can do to solve it.

    Basically the sitemap generator is crawling the index page of my e-commerce shopping cart and also 2 other cart php pages (cart and gift certificate pages), but that is it. None of the category or product pages are showing up at all in the sitemap crawl. I am worried that if the sitemap generator can’t crawl the pages, then search engines like Google won’t be able to either. I am very worried as I have spent weeks and hundreds of pounds setting this cart up :-( What do you think, please? Any ideas from a pro much appreciated.

    Jenny

    PS Happy to email you my shop url. Or please see my email address, which will lead you to the online store.

  224. C. Jeff Dyrek says:

    I was never able to get the generator to even give me a start page. I ended up installing Java, and the generator page would just appear with some text links at the top and there was no generator at all. What did I do wrong?
    C. Jeff Dyrek, Webmaster, Polar Explorer

  225. HIlary says:

    having problems getting the sitemap I created on your site uploaded to Google. I am entering my URL followed by sitemap.xml. I have put into the header of my index page the robot.txt to allow and my sitemap appears to have been submitted successfully to the other engines but not sure what is wrong with the Google one.

    Any suggestions? You can ignore my last question.

  226. Timothy Horrigan says:

    The ‘bot kept looping through my site and added every item which was in a subdirectory multiple times, only with extra /’s: e.g., a file in the “/documents” directory was added //documents and then again as /// and so on.

  227. stormy waters says:

    Hi, I think this is great! I am a total newbie to all of this, and your site and video helped me tremendously; guaranteed to buy you a Chai tea. I do have a question, though. I have tried two times to upload to Google (site verified OK), but it says it can’t access the location and that I should follow their guidelines. I am sure the sitemap follows their guidelines; coming from you, it must. So the problem must be that it cannot access the location. What do I enter after the website address?
    I saved the file as C:\Users\Doc\Documents\Sitemap.xml but this does not seem to work. I tried a few different things. What exactly should I be entering in the field after my URL address?

    many thanks

  228. Jim says:

    Hi Drew,

    Check out my comments here and look for the link to test your version of java – you have the wrong version, updates are free…

  229. Tim says:

    Hey Jim, thanks for this great tool first off. My company, as well as myself, are new to SEO and figuring it out for the first time. I know that submitting a sitemap to Google, Yahoo, etc is very important, thus why I am doing this.

    I was wondering if you could help me out a little bit though. When I run my sitemap, I come out with about 3,000 pages. Should I submit all 3,000 of these pages to google, or should I shorten my xml sitemap up to only include the ones that I feel are most relevant.

    This is probably a stupid question, but I appreciate it. Also under any specific category, for example, Cricut Cartridges, I have several pages of results,
    craft-e-corner. com/c-256-embellishments. aspx?pagenum=2
    craft-e-corner. com/c-256-embellishments. aspx?pagenum=3, etc.

    Should all of these pages be submitted as well. Thanks so much for your help Jim, I will def. be buying you some coffee, as you have saved me hours of work and time.

    Thanks again.

  230. Alexej says:

    Thank you very much for your site.
    A very useful resource.

  231. Jim says:

    Hey Duane,

    What is the website address that you are trying to build an xml sitemap for?

  232. Duane says:

    This is a great tool and I’m thankful that you have it and make it available for us to use.

    I’m having an issue though where the crawler is returning all of my .html pages marked as a HTTP 302 (in the sitemap tab), however if I go to the ‘URL check’ tab and enter the URLs there they show as ‘HTTP/1.1 200 OK’ at the top of the “Header fields” section.

    Clearly with the pages marked as 302, they aren’t being exported in the sitemap. Any ideas?

  233. Jim says:

    Hi Hal,

    Key here is “Exclude Filters”!

    For mywebestore.com software, use the exclude filters below – you should be able to generate a complete sitemap in a matter of minutes using them.

    *noc=true*
    *?b=1*
    *loginredir*
    */gbook/*
    */comments/post/*

    If you run the sitemap generator without filters, you should see why the file is becoming so large! This software package has comment and post sections with a register and login redirect using random unique urls. If you follow these urls, they again present more random urls thus creating an endless loop of urls!

    With my sitemap tool, you can create filters which will tell the generator to ignore these random unique urls; however, you should address the real issue and have the programmers of that software make them nofollows or exclude them in your robots.txt file. If you don’t, search engine bots may end up spending a ton of time spidering worthless urls and give up on real pages or worse.

    Generate your sitemap with my tool using the exclude filters, upload the new sitemap to your site and you’ll be all set.

    Best regards,

    Jim
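
    Editor's sketch: the wildcard exclude filters listed above can be tried against sample URLs before a crawl. This assumes “*” is the only wildcard (so the “?” in *?b=1* is a literal question mark) and that a filter must match the whole URL; the generator's exact rules are not documented in this thread. The helper names and all sample URLs are made up for illustration.

```python
import re

# Exclude filters quoted from the comment above.
FILTERS = ["*noc=true*", "*?b=1*", "*loginredir*", "*/gbook/*", "*/comments/post/*"]


def wildcard_to_regex(pattern: str):
    """Translate a filter into a regex; only "*" is treated as a wildcard."""
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile(".*".join(literal_parts))


COMPILED = [wildcard_to_regex(f) for f in FILTERS]


def is_excluded(url: str) -> bool:
    """True when any filter matches the complete URL."""
    return any(rx.fullmatch(url) for rx in COMPILED)


# Hypothetical URLs of the kind a store package might produce.
sample = [
    "/gbook/index.php",
    "/comments/post/123",
    "/login?loginredir=abc",
    "/products/widget.html?b=1",
    "/products/widget.html",
]

kept = [u for u in sample if not is_excluded(u)]
```

    Under these assumptions only /products/widget.html survives, which is the behaviour the filters are meant to produce.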

  234. Hal says:

    I have a sitemap that I am happy with now but the file size seems very large and when I upload it on the site it does not open in the browser. When I open it from the desktop with IE it has a note on the bottom about blocking script or an ActiveX control. The file size is about 9000k after uploading. I am using a store software package by mywebestore.com

  235. Jim says:

    No problem Kevin, glad it’s all working for you!

  236. kevin says:

    Hi jim

    Thanks for the clue about Java. I removed all Java from the computer and updated everything, then ran the sitemap twice with full normal coverage, but noticeably in half the time to complete! Strange. Not sure if the Java removal will affect other programs; I will let you know how it goes when I run the next maps. Thanks!

  237. Jim says:

    Ah, glad to hear that! Nothing has changed with the sitemap generator (I have not made any changes in the last few months), so the only thing that could be an issue would be something on your computer. If you try this from another machine, you’ll see it works fine. My guess is something with your version of Java.

    Give me your website address and I’ll build an xml sitemap myself and I’ll verify all this…

  238. Kevin says:

    It started about a week ago. I managed to complete 1 map this week, and tried yesterday on two different occasions from Firefox and also Google Chrome. It read everything OK up to the total, but then, when you wait for it to go through before you can export, the screen where the map is running just goes black. The rest is OK; it is always just the centre!

  239. Jim says:

    Hey Kevin,

    When did you start having this problem?

  240. kevin says:

    I am using the sitemap generator and have for several years, but recently it reads the 12,000 sections OK, and then as it processes them the screen just goes black where the sitemap data is! Everything around it is normal, and you can hear the hard drive running, but the map never returns, so you are left to exit the program with no results!

  241. Drew Davidson says:

    I have tried to run your program on Firefox 4 with Java enabled. I get the warning and the request for a cup of cha, click OK, and then get a blank page. What am I doing wrong?

    Drew

  242. Jim says:

    Hi Randy,

    Where is the location of the sitemap you made with my tool? Once I have that, I can help you. If it’s a sitemap created elsewhere, you’ll have to ask that publisher for help.

  243. Randy says:

    Hi, I have a xml sitemap, but none of the search engines are picking up on my /pages.html…just my index page. So I am losing traffic when someone wants redworms and more. What is the problem with what I am doing? wormsandgarden dot com

    Thanks, randy

  244. Bart says:

    Hello Col,
    gave this tool a try and it worked flawlessly. Any plans to provide an upgrade path to host, automate and allowing scheduling and ftp upload of sitemaps.. in short “set and forget”?

    Bart

  245. Jim says:

    Hi Ruben,

    No problem…

    Hey, I created an xml sitemap for your site and didn’t see any problems show up? It’s error free…

    If you run the sitemap generator on a website, then make changes while the browser is still open, it will not register the changes. You have to start a new project and re-generate the sitemap.

    Hope that helps!

    Jim

  246. Ruben says:

    Hi Jim,

    Thanks for your response. That’s what I did in the attached picture. It’s crawling my site, but not the actual links. It makes no sense: I always get errors on 3 pages linked from a main page (airmalaga.com/malaga-airport/malaga-airport-car-hire.htm), but I’ve checked that page, and those links do not exist. Furthermore, it doesn’t pick up any new pages (I’ve created, linked and uploaded several new pages, but it always makes a wrong sitemap). It’s as if the XML generator has some kind of cache and doesn’t try to crawl my site again.

    I’ve tried to remove my .htaccess and robots.txt, but I have the same problem. Removing my browsers’ caches didn’t work.

    Maybe our servers blocked your ip or something similar, I don’t know.

  247. Jim says:

    Your robots.txt file would look like this:

    User-agent: *
    Disallow:
    Sitemap: http://www.google.com/sitemap.xml

  248. Jim says:

    By old version, what exactly do you mean, Ruben? You mean why is a page showing in the sitemap as an error when it doesn’t exist? If that’s the case, simply look at the row with the error and under the In-L column, click the + sign and you’ll see the pages creating the link – that is where the problem is :)

  249. Ruben says:

    I don’t know why it is still getting an old version of my website airmalaga.com. I tried with firefox, IE9 and chrome.

  250. mangaradja says:

    Is it OK to write the XML sitemap location in my site’s robots.txt like this:
    Sitemap: http://www.google.com/sitemap.xml

    Please give some advice, I’m a newbie at this stuff
    Thanks sir

  251. Jim says:

    First try on the site I received

    HTTP/1.1 403 Forbidden
    Server: Lusca/LUSCA_HEAD-r14756
    Date: Sat, 23 Apr 2011 12:47:25 GMT
    Content-Type: text/html
    Content-Length: 2217
    X-Squid-Error: ERR_ACCESS_DENIED

    then changed the browser type and received a 200 (OK) and it spidered three pages.

    Can you give me an example of the parameter links you talk of? How many pages do you think should be showing?

  252. Pearl Magpie says:

    re: comment-18337
    Thank you Jim,
    you were correct of course i updated Java and it works great now, sorry for the delay in acknowledging your help
    best wishes, Pearl Magpie

  253. Solutions says:

    Hi,

    I am trying to get a site map for the site [snip] but not able to do so, it has parameter links, any ideas?

  254. David says:

    Greetings,
    I’d like to try the Sitemap Generator. It looks very impressive. Is there a link to try it out?

    Thank you very much.

    David

  255. Jim says:

    Thank you so much for your kind words Sandy! I’m glad to hear the tool(s) have helped you and appreciate the donation (Chai and Hot Dog) :)

    Best regards,

    Jim

  256. Sandy says:

    So thankful for this GREAT tool. I so appreciate your generous spirit. It’s pretty difficult these days to find someone giving back. I work for a church and I have no IT budget to speak of so, I really am happy to have discovered you. Thanks, Again.

  257. Jim says:

    Hi Pearl,

    It’s your settings and my guess is that you are missing java or have the wrong version. On the sitemap generator page, I have a link to java where you can verify that you have the latest version of this tool.

    Best regards,

    Jim

  258. Pearl Magpie says:

    Dear Jim,
    You have got me drinking “Chai” now! I have 2 computers on my home network, and the first one I used to generate a sitemap works great; however, the other will not allow me to crawl my site. The page is blank, like before the program loads. Is this my computer settings, or is it designed to only allow 1 access from the same IP address? By way of explanation, I have 1 computer in the shop below the flat where the other computer lives, and due to the 6 hours it takes to crawl our site I would like to do the crawl on either computer as time allows. Thank you, and let me know when you would like a refill…..PM

  259. Jim says:

    It still works perfectly :) – I just ran the sitemap generator on your site and didn’t see any errors.

    In the future, let me know the exact url in question – it will make it easier for me to troubleshoot. If, when you see the error, you click on the plus, you’ll see the url that is making the invalid reference (has the bad link).

    Glad to see all is well again :)

  260. Ruben says:

    Hi, it worked perfectly for years, but now I’m facing a problem that drives me crazy. I don’t know why, the generator crawls an old version of my site (I have the same problem with different websites over several servers).

    If you try to crawl airmalaga.com you will see 3 broken links, linked from the car hire page, but if you check the actual page, those broken links do not exist (those pages were removed some months ago).

  261. Jim says:

    Bet you a buck it’s because you don’t have your URL correct. Something like forgetting the www or the http portion of the url. Visit your site’s main page first, then copy that URL (location) and paste it into the sitemap generator and you should be all set.

    If it still does not work, then great! Because that means you have a SERIOUS error and the sitemap is detecting it. Send me your website address if this doesn’t work and I’ll attempt to take a look.

  262. Jim says:

    Hi John,

    You exported (created) an HTML page. Instead, choose “Sitemap XML” and you’ll have the correct sitemap which you can submit to Google, Microsoft, etc.

  263. John says:

    I used the Sitemap Generator tool, saved the sitemap as XML, and submitted it to Google, and I get the following message from Google:
    Sitemap is HTML
    Your Sitemap appears to be an HTML page. Please use a supported sitemap format instead.

    My sitemap can be found at holidayathomeshop. com/store/HolidayHomeSitemap.xml

    any ideas on what is wrong?
    thanks in advance.

  264. James Newton says:

    On my system, I seem to get an increasing number of “sleep interrupted” errors…. Any idea what causes that?

  265. Dave says:

    Hi,
    Haven’t used sitemap generator for a while but when I try now it stops after accessing my homepage and throws a Java nullpointer exception. Help please?

  266. Angelin says:

    Great tool, but I have a problem using it… I am not using rel=”nofollow”, but rel=”nofollow,noindex”. Maybe this is why I see my nofollow pages in the sitemap, right? How can I exclude them?

  267. Karl says:

    Thanks for the excellent tutorial, I used the java sitemap generator and it works just as well as commercial software.

    Thanks for the video too – top stuff

  268. Warren says:

    Jim,
    This is an awesome tool! Thank you for providing this to us ‘less than geeks’ finding our way in the SEO world.

    Question… when adding “Sitemap: /mynewsitemap.xml” to my robots.txt can I have two lines for the Sitemap: location? I use your tool for my general pages and I am provided a sitemap for the thousands of products pages. I would like the search engines to see both sitemap pages.
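
    For what it is worth, the sitemaps.org protocol allows more than one "Sitemap:" line in a robots.txt file, and each line should give the full URL of the sitemap rather than a relative path. A minimal Python sketch of that layout follows; the example.com URLs and the products-sitemap.xml file name are placeholders rather than anything from this thread, and the check only shows that Python's standard-library parser picks up both lines:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt with two sitemap entries (example.com is a placeholder).
ROBOTS_TXT = """\
User-agent: *
Disallow:

Sitemap: https://www.example.com/mynewsitemap.xml
Sitemap: https://www.example.com/products-sitemap.xml
"""


def listed_sitemaps(robots_txt: str) -> list[str]:
    """Return every Sitemap: URL declared in the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() (Python 3.8+) returns None when no Sitemap: line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    for url in listed_sitemaps(ROBOTS_TXT):
        print(url)
```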

  269. Jim says:

    Try these:

    *clientscript*
    *members*
    *search.php*
    *archive*
    *cron.php*
    complex:[0-9]{1,9}-post[0-9]{1,9}.html
    complex:^.*post[0-9]{1,9}.html
    *attachments*

    The two complex: patterns will remove the links to single posts that are part of a thread, such as 3418-post1.html, which you would want to turn into comments of a WordPress post.
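
    To see which URLs a filter list like this would drop, here is a rough Python sketch. It assumes that plain entries are shell-style wildcards and that the "complex:" prefix marks a regular expression searched anywhere in the URL; that is a guess at the generator's behaviour, not its actual code, and the forum.example.com URLs are made up:

```python
import re
from fnmatch import fnmatchcase

# The exclude list from the comment above, copied as-is.
EXCLUDES = [
    "*clientscript*",
    "*members*",
    "*search.php*",
    "*archive*",
    "*cron.php*",
    "complex:[0-9]{1,9}-post[0-9]{1,9}.html",
    "complex:^.*post[0-9]{1,9}.html",
    "*attachments*",
]


def is_excluded(url, patterns=EXCLUDES):
    """Return True if any exclude pattern matches the URL."""
    for pattern in patterns:
        if pattern.startswith("complex:"):
            # Assumption: "complex:" entries are regular expressions.
            if re.search(pattern[len("complex:"):], url):
                return True
        elif fnmatchcase(url, pattern):
            # Assumption: everything else is a case-sensitive wildcard.
            return True
    return False


if __name__ == "__main__":
    for url in (
        "http://forum.example.com/3418-post1.html",
        "http://forum.example.com/members/jim.html",
        "http://forum.example.com/showthread.php?t=3418",
    ):
        print(url, "->", "excluded" if is_excluded(url) else "kept")
```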

  270. Nomoney says:

    I’m converting vbulletin to wordpress and wondering what I should use as exclude filters?

  271. Cova says:

    Hello Jim,

    Congrats on this really helpful tool. Been using it for more than a year now and I’m totally satisfied as a webmaster. From the large variety of free tools of this sort, this is the most helpful for a site’s internal diagnostics. It certainly is more than a sitemap generator.

    My thought on this though, as I am only now writing a comment, is that it would be most helpful if you could integrate a module to track internal anchor text (or images alike).

    I’m now developing a visual tool similar to this but with a more graphic interface (more like visual thesaurus) and I was thinking maybe we could collaborate.

    Anyways, keep up the great work!

    Best wishes,
    Alex from Romania

  272. Jim says:

    Thanks Berny, you made my day!

    If you make changes to the site and want them reflected in your sitemap, make sure to start a New Project from the menu – you’ll see the changes then.

    Thanks again!

    Jim

  273. Berny says:

    Can’t thank you enough for this fantastically useful web site. I’ve tried other site map generators and they were all useless.
    One problem I did notice was that when I edited some incorrect page titles on my web site, Sitemap Generator did not recognise the changes. I ran it 3 or 4 times without success. However, I ran it again the next day and it recognised the changes. I suspect it may be something to do with my hosting company.
    In any case, it’s definitely worth a cup of chai.
    Thanks again.

    Berny, UK.

  274. Jim says:

    Hi Bill,

    What is the url of the site that has gone missing? No worries, I won’t publish it but can’t help you without it.

  275. keith harrell says:

    Hi Jim — Big problem for me! I am a novice by the way.

    Once I turned off anti-flood in Joomla your generator worked brilliantly. I used it same way for 3 sites and uploaded all via webmaster tools.

    They have all been re-indexed within hours by Google. 2 of them are fine with hundreds of extra indexed pages, but one site only goes to a blank white page and my site has completely disappeared!!
    The only thing I did differently was to make the change frequency ‘never’ on my 2 x homepage urls.

    I had 500-600 pages found by your generator, but when I went to do another sitemap it showed ONE url only, my home page. Seems like something is blocking things.
    Can you advise please?

  276. Jim says:

    Hello Alex,

    Well thank you, for your compliment and your suggestions :)

    1) That is incorrect, the sitemap generator does respect the robots.txt. Provide the url of the site in question and I’ll tell you what you have configured incorrectly. As for caching the robots.txt file, you need to shut down all your browser windows after modifying the file.

    2) You’ll be happy to hear that you now have the ability to include PDF links, or images, or anything else in your sitemap. If you don’t want it to appear, delete it from the results before you create the sitemap file (Export XML).

    3) As for the top and bottom having to be adjusted, that is the only way I could think of to show my donate links – it forces them to have to acknowledge the plug in order to use the generator. Very rarely do people buy me a cup of Chai, and I see a TON of repeat visitors; I’m thinking of a subscription ($5 per year) and placing the tool on a password protected page – then you wouldn’t have to worry about scrolling…

    If you can think of a better way, please, let me know!

    And again, thanks for the suggestions on PDF and Images – Enjoy :)

  277. Alex says:

    Great App!!! Thank you!

    A couple of questions:

    1. It seems that the “respect robots.txt” option is not working correctly. When the option is on, a lot of my urls are skipped by this rule. But I checked my robots.txt and there’s nothing in it that should block those urls. I had a lot of disallow rules a couple of weeks ago, but since then I’ve changed my robots.txt several times.
    It seems that the sitemap generator uses a cached robots.txt. I’ve tried emptying robots.txt on the server and purging the browser cache – the result is always the same.
    I’ve checked logs on server. Sitemap Generator is not downloading robots.txt on its start.

    2. What I miss in the SiteMapGenerator is an ability to export links of PDF files and pictures to sitemap.xml. Google support images links in sitemap.xml. Not sure about PDF links.

    3. Even on screens with a resolution of 1900×1200, users have to scroll up and down to see the top buttons (Project/settings/Sitemap etc) and bottom buttons (start, stop, statistics etc). It would be more convenient to have all those buttons and statistics on screen without scrolling.

  278. Jim says:

    Thanks John, your kind words made my day :)

  279. Jim says:

    Hello Stephan,

    No, you can’t schedule the sitemap generator to run automatically and then ping Google, it must be completed manually. However, I shall think about this and perhaps I can work something in…

    Best regards,

    Jim

  280. John says:

    This sitemap generator is truly a great tool for webmaster, and the creator of it is so generous that allows people to use it for free, he/she must be also truly a great person.
    Thank you very much!

  281. Stephan Brandligt says:

    Hi,

    Can you schedule Sitemap Generator to run say every day? And can it automatically send a http request (ping) to Google to notify a new sitemap is ready?

    Kind regards,

    Stephan Brandligt

  282. Jim says:

    Hi Jeremy,

    I just sat down with my morning coffee to check out your site and am glad to hear the sitemap generator found the problem!

    Thanks for the Kudos!

    Jim

  283. Jeremy says:

    Hi,

    Please ignore my previous comments. The problem was between the chair and the keyboard! I had not linked correctly to the new pages… So in short your Sitemap generator was the only one to pick this up and report on this correctly. Even the paid ones did not report this!

    Thanks for creating a great generator.. Let me know how we can support this project!

  284. Jeremy says:

    Hi,

    Great tool… Without doubt the best one out there! I have been using it over the last couple of days to create a sitemap for my site. It has been very helpful at notifying me of broken links which I have fixed however I have gone today to run the final scan (hopefully no more broken links) but it is not indexing the new pages I have created today. The folders and contents it is not indexing are:

    I have no entries to block these in robots.txt and the other directory and contents I created a couple of days ago are indexed fine and are in the same parent folder.

    I have attempted to clear my cache on my computer but it just does not seem to work. If you could help it would be greatly appreciated.

    Kind Regards.

    Jeremy

  285. alex says:

    Hi,
    I need the AuditMyPC.com Site Mapping Tool report for this web site.
    Can you help me?

  286. Jim says:

    My sitemap generator does respect sessions, so if you log into the site with the password and then fire up the sitemap generator with the same browser (open a new window, tab, etc) you can spider it that way – but, remember that the sitemap generator will follow all links and if you are the admin and there are links to delete something without confirmation, then those links will be followed.

    My sitemap generator is the only one that respects sessions that I’m aware of.

    What type of password system is set up? Is it a session variable, password prompt, etc. Have an example?

    Thanks!

  287. Ezekiel says:

    AMAZING TOOL. And thank you very much for the how-to video!

  288. Conrad says:

    I would like to create a graphical map of my client’s website however it is password protected.

    Are there any tools that would allow me to map a website which is password protected, but for which I am the developer and possess the password?

  289. Jim says:

    Hi Miguel,

    No, you can’t run this from a cron job. It is not automatic, but if you have the majority of your pages in the sitemap, Google, Bing and others will find them. Some of my top performing sites have been performing well for years and I’ve only updated the sitemap a few times a year.

    And thanks for the kudos :)

    Jim

  290. Jim says:

    Hi Michael,

    I have very little time, but I can take a look – what is your site and what don’t you want the sitemap generator to index?

  291. Michael says:

    Hi Jim,

    I have a Zen Cart site and want to build a site map for it, however I have a bunch of items (dynamically created pages) that I do not want to index. I have read through your instructions and watched the how to video, but am far too much the novice to make any decisive and accurate decisions on how to go about this.

    Can you please let me know a good place to start and any help is GREATLY appreciated.

    Thanks!

    Michael

  292. Miguel says:

    Hello. First of all, Nice work.
    I have a question: once the sitemap is created, installed on the root and submitted to search engines, is all the updating automatic? I have been using another service and I had to set a cron job to run the crawl. Thanks.

  293. Herman says:

    Hi Jim,

    Thanks a lot for this web tool…as a newbie that’s great.

    Cheers

    Herman

  294. Caleb says:

    Nice idea. I like the way the sitemap generator was created. It is fast and I’m not seeing any errors so far.

    Caleb

  295. gregory claeyssens says:

    hi and thank you for the great tool,
    I have made a website with php and some of my url’s end with a php variable like “claeyssens .be/productenGallerij.php?&pad=Fotos-Tekst/Artikels/Monturen/Anne%20et%20Valentin” (without the quotes of course). When i use the tool it generates this link in 2 forms, the first ending with “valentin”, the second ending with “valentin%20”. Why 2 times? And why with one ending with that %20? Because the latter case is a bad link. I have tried ending the url’s with a “/” but the same thing happens (more or less), now the link appears in 3 forms: with /, without / and without / but with a %20 at the end. I have also checked all links in the site, they are fine, no problems there.
    The site crawled is claeyssens. be
    I would really appreciate some help or some enlightenment on how the crawler thinks and works.
    Thank you in advance,
    Gregory

  296. Jim says:

    Hey KK2 – send me your website address you’re building the sitemap for and I’ll take a look.

  297. Jim says:

    Hi Jason,

    No problem and I’m glad that you like the tool! As for the Earl Grey, I’m pretty stuck on my Chai, but I’ll give it a shot :)

    Best regards,

    Jim

  298. emre says:

    Hello, I tried on different machines and never succeeded in running this app. My Java is up to date. In the console I get:
    19.02.2011 00:04:07 – DEBUG – Starting AuditMyPC.com WebTool v1.6…
    Exception in thread “AWT-EventQueue-2” java.lang.ArithmeticException: / by zero
    at jmaster.webtool.view.impl.sitemap.SitemapColumnFilterView.<init>(Unknown Source)
    at jmaster.webtool.view.impl.sitemap.SitemapView.<init>(Unknown Source)
    at jmaster.webtool.view.impl.main.MainView.<init>(Unknown Source)
    at jmaster.webtool.app.WebToolApplet$1.run(Unknown Source)
    at java.awt.event.InvocationEvent.dispatch(Unknown Source)
    at java.awt.EventQueue.dispatchEvent(Unknown Source)
    at java.awt.EventDispatchThread.pumpOneEventForFilters(Unknown Source)
    at java.awt.EventDispatchThread.pumpEventsForFilter(Unknown Source)
    at java.awt.EventDispatchThread.pumpEventsForHierarchy(Unknown Source)
    at java.awt.EventDispatchThread.pumpEvents(Unknown Source)
    at java.awt.EventDispatchThread.pumpEvents(Unknown Source)
    at java.awt.EventDispatchThread.run(Unknown Source)

  299. jason@eldonrv.com says:

    Thank you so very much for (a)this program (b)keeping it free (c)teaching. IT WORKS. And it’s not set up to impress me with how much somebody else knows, it’s set up to USE. Last time I worked in the internet business, I was learning HTML4.0 – now I am trying to make a rather constrictive CSS driven e-commerce site look like a website AND store. The ‘we generate your sitemap for you’ tool only creates HTML, stored as .aspx, and Google just spits it back at me.

    All kinds of other folks promise to help me generate a sitemap, keywords, etc. but honestly they just expect me to put all the info in for them. If I had that much time I could just write the sitemap by hand!!

    Thank you again so much.

    Can I convert you to Earl Grey?

    Jason

  300. KK2 says:

    Great tool and pretty good instructions. The one I looked for I couldn’t find. It seems I have repeating URLs. By that I mean the sitemap has listed some pages many times. Some with http and some with https. Not sure how to correct this. It’s like the engine is circling. Any thoughts?

  301. Mick says:

    Firstly, great tool!
    I have one issue – the results are very inconsistent for last mod content. Not all my URLs are displaying last mod info in the sitemap.xml

    Every webpage has the correct meta tag – I’m using it for a large site too, so manually adding is not really an option

    any assistance would be great

  302. John McKay says:

    Hi,

    I have generated the xml sitemap but I don’t know how to upload it to my web host. I use Net Objects Fusion to create my website. Do I have to create a new page in the website then upload this to the host server?

    Regards

    John

  303. suzanne milbourne says:

    Thank you so much for explaining how to upload a sitemap to my google site! You are the only website I have come across where there has actually been some instruction on how to upload.. all other sites just say “upload to your site” as if we magically know how to do that! Congrats on getting it right for us technically challenged people! You can have as many cups of Chai from me as you like!! x

  304. Jim says:

    Hello Sultan,

    What was the website address that you entered into the xml sitemap generator?

  305. Sultan says:

    Hello Sir,
    I am very happy to find this free sitemap tool, but there is a problem when I generate my blog sitemap. When I press the play button, the process automatically stops after 10 seconds. Please help me create a sitemap freely, easily and quickly.
    Note: please let me know whether I will be able to download the software.

    Please inform me as soon as possible

    Thanks

  306. Jim says:

    Hi Cathy,

    Visit java.com/en/download/installed.jsp and check the version of Java you are using. You won’t be able to generate an XML sitemap unless you have Java installed (it usually is, and it’s free…)

  307. Palestine Job says:

    I can’t find how to make the sitemap. Where is the tool that you use?
    We need to crawl my site pal-stu.com

  308. Cathy says:

    Where on this site is the download link for the free sitemap creator?? Clicking the Sitemap generator + Webmaster Tool simply sends me to a page that looks like something is supposed to run, but doesn’t?

  309. Web Design says:

    We want to create a sitemap with your tool. How would we do that? Please help.

  310. soccer predictions says:

    nice sitemap builder, thanks for not charging!

  311. Stuart says:

    Jim,

    I downloaded the latest version of java, and of course now the sitemap tool opens.

    Thank you.

  312. Jim says:

    Stuart, what version of Java (not javascript) are you running?

  313. Stuart says:

    When I click on the Sitemap generator image I get the warning to make sure I’m not logged onto my site etc. then I click OK and nothing happens, I just see the following;

    See our sitemap generator page for instructions and help. Take our free Anonymous Surfing test and protect your privacy! Buy me a cup of Chia Tea. AuditMyPC.com. Firewall Test · Anti Spam · Internet Speed Test · Anonymous Surfing · Website Monitoring

    I have java script enabled, any ideas?

    Thanks.

  314. Michelle says:

    I have exported and saved both the .xml and .html sitemaps to a specific folder and to my desktop. The files are not visible in my Dreamweaver folder nor on the desktop. However, if I click on export a second time, the save to window shows that they exist. I did a search in Windows Explorer and they were not found. It’s as if the program is not actually saving the files. Where are my files? I am running on Vista.

  315. Jim says:

    Hello Paulo,

    The problem is the way your server is telling visitors how it is encoded. Do this in the header / meta attribute so the browser (client, etc) knows which encoding to use for reading characters. You can see how your site is encoded by using a tool built into the sitemap generator, here is how:
    In the sitemap generator, open “URL check” pane, put in your main website address into URL and click start. You’ll see the http header including encoding type – try it with another website, then yours and you’ll see the point I’m trying to make – see, the tool already helped you find a problem :)

    Without encoding, the browser will guess at the charset for the page (Western European ISO), but the webtool uses UTF-8 by default.

    Solution: Add character encoding to your site and don’t leave it to browsers to guess (that can lead to problems :).
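
    The effect can be reproduced in a few lines of Python. This is only an illustration under the assumption that the page bytes are ISO-8859-1 (the "Western European ISO" guess mentioned above) while the reader decodes them as UTF-8; the header and meta strings show one common way of declaring the charset and are not taken from the site in question:

```python
# "N\u00c3O BASTA" is the title with A-tilde written as an escape,
# so this file does not depend on its own encoding.
TITLE = "N\u00c3O BASTA"

# One common way to declare the charset (assumed values, for illustration):
HTTP_HEADER = "Content-Type: text/html; charset=ISO-8859-1"
META_TAG = '<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">'


def decode_as(raw: bytes, charset: str) -> str:
    """Decode page bytes with the given charset, marking undecodable bytes."""
    return raw.decode(charset, errors="replace")


if __name__ == "__main__":
    raw = TITLE.encode("iso-8859-1")    # bytes as the server would send them
    print(decode_as(raw, "iso-8859-1"))  # decoded with the declared charset
    print(decode_as(raw, "utf-8"))       # decoded with a wrong UTF-8 default
```

    With the matching charset the title round-trips; decoded as UTF-8, the single byte for the accented letter is invalid and comes out as the replacement character, which is the symptom Paulo reported.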

    I hope that helps!

    Jim

  316. Paulo Habl says:

    Hi,

    Thanks for this tool.

    I have run it on my site Pailegal (pailegal. net), which is written in Portuguese, and have found that the characters in the Title come out wrong. Example:

    pailegal. net/relstemot.asp?rvTextoId=1236775313
    N�O BASTA…..
    where it should be:
    NÃO BASTA…

    Could you please advise if and how I could have the correct Title?

    Regards
    Paulo

    P.S. I would like to use this tool to generate the URL/Title and help me with the mapping for a URL Rewrite task.

  317. Jim says:

    Hi Frank,

    You submitted a Sitemap Project, not an XML Sitemap. Select Export, then Sitemap XML. Check out the video for a great example!

  318. Frank says:

    Hi Jim,

    Merry Christmas!

    I just submitted my generated sitemap and Google already picked it up, but!!

    In my description text I have this error:

    Warning: SimpleXMLElement::__construct() [simplexmlelement.--construct]: Entity: line 1: parser error : Space required after the Public Identifier in …

    Can you please help, what does this mean?

    Best regards

    Frank

  319. Jim says:

    No problem Art,

    Unique request :) – Is there a unique file in each directory that has the same name, like index.php that is visible? You could use an include filter only for this. It’s going to be tricky – either the right pattern or spider the entire site, then use filters. For example, you could use this filter on the url:

    complex:[A-Za-z0-9-]/$

    That would show all directories that it knows about ending with a forward slash and referenced directly. This filter is not what you are looking for, but it gives you an example of what can be done!
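
    As a rough illustration of what that pattern keeps, here is a small Python sketch. It assumes the "complex:" filter is a regular expression searched against the whole URL (an assumption about the tool, not its documented behaviour), and the example.com URLs are invented:

```python
import re

# The pattern from the filter above, without the "complex:" prefix.
DIRECTORY_PATTERN = re.compile(r"[A-Za-z0-9-]/$")


def directory_urls(urls):
    """Keep only URLs that end in a letter, digit or hyphen followed by '/'."""
    return [url for url in urls if DIRECTORY_PATTERN.search(url)]


if __name__ == "__main__":
    crawled = [
        "http://www.example.com/",
        "http://www.example.com/docs/",
        "http://www.example.com/docs/index.php",
        "http://www.example.com/docs/guide.pdf",
        "http://www.example.com/shop/boots/",
    ]
    for url in directory_urls(crawled):
        print(url)
```

    Only directories that some page links to with a trailing slash show up, which is why the filter is an example rather than a full directory tree report.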

  320. Art Kedzierski says:

    What if I JUST want a directory tree structure report (i.e., no documents)? What’s the easiest way to configure that?

  321. blood tester says:

    How can i create sitemap for wordpress site? My site targets the keyword “blood tester” and is at testermeters dot com.

  322. divproject says:

    Thank you for the good service. I made a map quickly and conveniently. I will recommend you to everyone.

  323. Mark Garrett says:

    Jim,
    I should have added that I am using the following parameters for the crawler.
    Request Delay – 0.2 seconds
    Connect Time – 65.9 seconds
    Read Timeout – 91.5 seconds
    Transfer Rate – Infinite
    Thread Count – 9
    Auto Save Interval – Infinite

    No includes.

    Excludes
    *.jpg
    *.gif
    exports/*.*
    customized/*.*
    images/*.*
    page/*
    category/*
    /account/*.*

    Regards,
    Mark Garrett

  324. Mark Garrett says:

    Jim,
    I have been able to get the missed pages down to 12. I validated the pages against both the XML map and the AuditMyPC.com’s Site Mapping Tool report. Here are the twelve that are missing.

    BATTLEFORGE
    BLACKANDWHITE2
    BROTHERARMEBLD
    BTLFORMIDER2WK
    JUMP4THJC
    LOCKONJC
    MASSEFFECT
    MATHBLASTER6-8
    MORROWGOYDVD
    MSFLTSIM10XDLX
    TRAINZ2004
    TURBOCAD15DLX

    I really like the tool. However, because of the ?pages and the products? pages the HTML file is generated very weird. I’ll probably have to use an XML to HTML converter once I remove these pages from the XML sitemap file.

    Regards,
    Mark Garrett

  325. TOPSEO says:

    I’d like to thank you once again for your free tool that helps me a lot in my job: the tool is working perfectly and it has lots of options that add real value to it. I have tested with a quite big website (more than 7,000 URLs) and I got a bunch of precious data for my project.
    Thank you Jim! By the way, how was my cup of Chai? :-)

  326. Jim says:

    Hi Mark,

    Give me a few urls to pages that the sitemap generator is not catching and I’ll run it and see what’s up.

  327. Mark Garrett says:

    Jim,
    I have a website Video-Games4U dot com that has 1245 products listed on it. When I run the sitemap generator I am consistently missing 76 products. I am using the following exclusion.
    *.jpg
    *.gif
    exports/*.*
    customized/*.*
    images/*.*
    page/*
    category/*
    /account/*.*

    I have checked the site with another sitemap generator and it does not miss the products. I am at a loss. The site map generates 14 page links to non products and about 30 links to display pages (page? and some page numbers). The site displays products in three columns and that is what these pages are pointing to (page? ). Any ideas as to why it is missing these pages?
    Regards,
    Mark Garrett

  328. Hallee says:

    How do I get a tree of my sitemap?

  329. Jim says:

    Hello Panagiotis,

    I’ve looked at your site and it behaves as though it looks at the browser type and, if it is not a standard browser, gives an error right away. I changed the browser string from AuditMyPC Sitemap Tool to Mozilla Firefox (the drop down option in the sitemap generator’s settings section) and it started spidering; however, after spidering about 10 pages, I then received a forbidden error on the remaining pages, hinting that something is looking at the rate of spidering. I changed the delay between requests to 2s under the Crawler tab and it seemed to spider more, then started to deliver Forbidden errors again.

    The messages you see under ‘problems’ are for internal use and are standard when receiving such messages from a website.

    If your server is producing forbidden messages for my sitemap generator, then chances are real good that Google’s getting the same thing. Try the sitemap generator again, make the changes I’ve suggested here and see for yourself.

    Best regards,

    Jim

  330. Panagiotis Sidiropoulos says:

    Using the on-line sitemap creator, the process stops after only a few URLs are indexed, and all URLs with non-Latin (Greek) characters fail. Here is the Problems report:

    07.12.10 12:05:17, Error: Fatal error, cause: java.lang.InterruptedException
    07.12.10 12:05:17, Error: Fatal error, cause: java.lang.InterruptedException

    If someone is interested in testing my site, its base url is texnikosnet [dot] gr

    Any help would be most appreciated.

    Regards, Panagiotis

  331. MC says:

    Heh, just noticed that the latest comments come on top. About my previous problem, I think that if I change robots.txt, I have to restart Firefox in order for the sitemap generator to take in the new robots.txt contents.

  332. MC says:

    When you check respect: <a rel = "no follow"
    does it mean that it respects "nofollow" without the space, as it should?

    It would be good to have an additional:
    check respect: <a rel = "noindex"

    regards

  333. Jim says:

    Hey MC,

    I can’t help you without a web address to look at – I need to see your setup in order to solve your xml sitemap problem.

    Let me know…

  334. MC says:

    my robots.txt:

    User-agent: Googlebot
    Disallow:

    User-agent: *
    Disallow: /img/

    Unless I uncheck “respect robots.txt”, the sitemap generator crawls nothing?
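
    For comparison, this is how Python's standard-library robots.txt parser reads the file quoted above. It says nothing about how the sitemap generator itself parses the file; the user-agent string is the "AuditMyPC Sitemap Tool" name mentioned elsewhere in this thread, and www.example.com is a placeholder host:

```python
from urllib.robotparser import RobotFileParser

# The robots.txt from the comment above.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /img/
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given agent may fetch the URL under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    agent = "AuditMyPC Sitemap Tool"
    for url in (
        "http://www.example.com/index.html",
        "http://www.example.com/img/logo.gif",
    ):
        print(agent, url, allowed(agent, url))
    print("Googlebot", allowed("Googlebot", "http://www.example.com/img/logo.gif"))
```

    Read this way, the rules only keep non-Google crawlers out of /img/, so a crawl that returns nothing at all probably points at something other than these four lines, such as a stale cached copy of the file.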

  335. Jim says:

    You are welcome Andy!

  336. andy says:

    Thanks for the sitemap

  337. سعودي كول says:

    thank you

  338. joydev says:

    Hi,
    Do you know you have done so much good for all novice and fresh website owners and the general public by posting all this information in one place? Truly, knowledge, like happiness, is doubled when shared. Thanks a ton, buddy.
    Hats off to you and your effort

  339. Jim says:

    Hi Chris,

    Yes, looking at late next month if time permits…

    Thanks!

  340. Chris says:

    Have you got any plans to include video sitemaps into this great tool?

  341. Jim says:

    Hi Mark,

    You exported a “save project file” rather than the XML Sitemap Google needs. Here is what you needed to do:

    1) Run the sitemap generator on your site.
    2) Click on the Sitemap Tab
    3) Click on the Export Tab
    4) Select Sitemap XML from the drop down list

    Those are the steps needed to create the sitemap xml file that you’ll submit to Google.
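
    To tell at a glance whether an exported file is a real XML sitemap rather than a project file or an HTML page, it helps to know roughly what one looks like. The Python sketch below builds a minimal file in the sitemaps.org format; it is not the generator's own export code, and the example.com URLs and change frequencies are placeholders:

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build a minimal XML sitemap from (url, changefreq) pairs, as bytes."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, changefreq in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "changefreq").text = changefreq
    # xml_declaration requires Python 3.8 or newer.
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    pages = [
        ("https://www.example.com/", "weekly"),
        ("https://www.example.com/products.html", "monthly"),
    ]
    print(build_sitemap(pages).decode("utf-8"))
```

    A file that starts with an XML declaration and a urlset element in this namespace is the kind Google accepts; a file that opens with html tags or project settings is not.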

  342. Jim says:

    Hey Mark,

    So what exactly is the name of the sitemap file? Or better yet, what is the exact website address of the sitemap file and I’ll take a look.

  343. Mark Garrett says:

    I ran the sitemap generator and cleaned up the problem areas. I set the change frequency on the pages that change frequently (weekly). I then uploaded the sitemap to my site video-games4u. I submitted it to Google through the Webmaster Tools. Google gave me the following error.

    We encountered an error while trying to access your Sitemap. Please ensure your Sitemap follows our guidelines and can be accessed at the location you provided and then resubmit.

    My robots.txt file shows the following.
    User-agent: *
    Allow: /

    I don’t understand what is causing the problem. I looked at the sitemap.xml file and could not see anything wrong.
    Regards,
    Mark …

  344. Jim says:

    Hi Asha,

    When you visit the page with your browser, you’re making one request, but the sitemap generator defaults to 5 requests at the same time (it can go up to 9 requests), so your results doing it individually will be different. What you can do is bump up your Request Delay from 0.0s to something like 1s – look under the Crawler Tab for these settings.

    Thanks for sharing and the kudos on the software :)

  345. Asha says:

    I had problems with low memory also. I upped the max jre memory usage to 256 meg (java -Xmx256m) which solved the problem.

    The issue I cannot solve is the generator tends to give me many timeout errors waiting to connect when crawling my pages, even though when I go to the pages directly via web browser the page is displayed without issue. Sometimes I have to reload the sitemap generator a few times before it works reliably. Other than that it’s a great piece of software. Many thanks for releasing it.

  346. Stephen says:

    Hi Jim

    Have found your sitemap generator because I wish to submit a sitemap to Google.

    My site is worldaudiobookclub .com which has 12000-14000 audiobook titles, which is not the problem. As each page opens – with each unique title as its centrepiece – a window of New releases, Recommended and Bestsellers also appears to accompany the selected title. Each new visit to each page brings a refreshed selection, meaning that each title can appear once as its own title ID, but many other times in the adjacent panels, randomly populated by ‘the system’.

    Thus, my sitemap recognizes approx 17000 URLs, but each of which can have up to 20+ extra titles.
    This may well be creating a maze more than a map!

    Your software appears to queue them all correctly, but after about 8000 ‘finished’ results, it stalls and won’t go further. The last 50-80 entries list as ‘Failed’ – Low memory…

    Is this because my PC’s memory is full, there is a limit on the size of the XML file that can be created, or what? Even better, what settings do I apply to not include the extra links, or conversely, to only include the titleID pages?

    Looking forward to your help. And I’ll get you a Chai as well – it’s one of my favourites.

    Kind regards

    Stephen Barrett
    New Zealand

  347. Tim says:

    Thanks Jim, I have been looking for months for such a tool and I can’t believe the price. I know time is a very elusive creature, but let me say your service to the community as a whole is priceless! I have been struggling for a while now trying to get my site indexed by the big 3. I’m going to use this tool regularly. I am sure I’ll have some questions before I am satisfied with the results, but I cannot stress how much I appreciate your service and the information you provide.

    I’ll proudly put a link in the footer of my site for you!
    Thanks, Thanks, Thanks

  348. Denny says:

    Hello.. Good service, but I have some trouble.
    My site has Russian words and the XML sitemap does not understand my language.
    So what can I do?

  349. Rob says:

    My sitemap.xml file does not display as a formatted xml file in the browser. Am I missing something? It displays as just one long string of text (VERY long!).

  350. anonymous says:

    I can’t get what to do. I mean how to begin that process. I don’t see any link where I can access the interface of sitemapgen.

  351. Jim says:

    Hi Bruce,

    At this time, only text/html files are included in the xml sitemap. The PDF is an application/pdf type.

    It is on my list to add this down the road, but time is an elusive friend.

  352. Bruce Bowman says:

    Hi Jim

    RE- XML Sitemap Tool

    Although I successfully created a sitemap.xml file for my web site using your Sitemap Generator, when I saved the sitemap file, NO listings were included for the many PDF files that I have on the web site.

    However, all [paths to] PDF files on the site were included in the generated sitemap list… just not saved to the sitemap.xml file.

    How do I force the generator to include PDF files in the final sitemap.xml saved to my hard drive?

    When I used the limited edition Google Sitemap generator, PDF files were included in the sitemap.xml file.

    Thanks
    Bruce

  353. Kevin Wallace says:

    Great site map tool and a real treasure for an enthusiastic but amateur webmaster.

  354. Bruce Bowman says:

    Although I successfully created a sitemap.xml file for my web site, when I saved the sitemap file, NO listings for the many PDF files I have on the web site were included.
    However, all paths to PDF files on the site were included in the generated sitemap list… just not saved to the sitemap.xml file.
    How do I force the generator to include PDF files in the final sitemap.xml saved to my hard drive?

    Thanks
    Bruce

  355. Bruce says:

    I like your tool and use it when creating almost all of my sitemaps. I have a website I’m using it on which keeps returning not only the http versions of the urls, but also the https versions of the same urls. I have put code in most of the files it’s returning to generate a dynamic META robots tag with “noindex,nofollow” when accessed via port 443. This seems to work fine via the browser (I see the tag in the source), but the sitemap tool still lists these urls even though I have the box checked to use the meta robots tag. I’m trying to figure out if the search engines would be doing the same thing or if this is an issue with the tool, or if there’s more I need to do with the code.

  356. Remi says:

    Hello

    I have used XML Sitemap Tool. Could you tell us more about this error :
    “unexpected end of file from server”

    Regards
    Remi Brandini

  357. Frederick Gella says:

    I have a problem with my website. When I burn my rss feeds to feedburner.google.com I always encounter the error: The URL does not appear to reference a valid XML file. We encountered the following problem: Error on line 15: Open quote is expected for attribute “{1}” associated with an element type “language”.

    I have already checked my template code and I have even deleted the meta tag with language in it, but the problem is still there.

    Please help me.

  358. kral oyun says:

    Nice sitemap builder, Thanks!

  359. Dave says:

    Hi. Nice generator. Does it create multiple sitemap files on export? I heard Google has a 50k limit per file so we should use multiple files.
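
    The sitemaps.org protocol caps a single sitemap file at 50,000 URLs, and larger sites list several files in a sitemap index. Whether the generator splits its export automatically is not stated in this thread, so the following is only a rough Python sketch of doing the split by hand. The example.com URLs and the file names are made up for illustration.

```python
from xml.sax.saxutils import escape

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file URL limit in the sitemaps.org protocol


def chunk_urls(urls, size=MAX_URLS_PER_FILE):
    """Split a list of URLs into consecutive chunks of at most `size`."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_urlset(urls):
    """Return one sitemap file (a <urlset>) as a string."""
    entries = "".join(f"  <url><loc>{escape(u)}</loc></url>\n" for u in urls)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<urlset xmlns="{SITEMAP_NS}">\n{entries}</urlset>\n'
    )


def build_index(sitemap_urls):
    """Return a sitemap index that points at the individual sitemap files."""
    entries = "".join(
        f"  <sitemap><loc>{escape(u)}</loc></sitemap>\n" for u in sitemap_urls
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<sitemapindex xmlns="{SITEMAP_NS}">\n{entries}</sitemapindex>\n'
    )


# Hypothetical site with 120,000 pages (illustration only).
pages = [f"http://www.example.com/page{i}.html" for i in range(120_000)]
chunks = chunk_urls(pages)
names = [f"http://www.example.com/sitemap{n}.xml" for n in range(1, len(chunks) + 1)]
print(len(chunks), "sitemap files")
print(build_index(names))
```

    Each chunk would be saved as its own file with build_urlset, and only the index file would be submitted to the search engine.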

  360. Muhammad waseem says:

    It is very nice website for making site maps and found it easy to use, thank you.

  361. Linda J. says:

    Hi Jim,

    I am a newbie with sitemaps but I love the Webmaster Tool. My problem is, I have one error that I can’t seem to correct. It is a .jpg file but is listed with a MimeType of text/html. I went to the web page and everything looks fine in edit. How can I correct this one error?

  362. Basil Halhed says:

    Re: XML Sitemap Tool (Cont’d)

    Hi Jim,

    Using Firefox, set to level four, your app continues to level 5 (at which point I stopped — level five takes many hours to complete, even with 9 threads).

    Thanks again for your help and support.

    Basil

    Answer: Level 0 is root, level 1 is 1 level down – perhaps you wanted level 4, not 5.

  363. Jim says:

    Hi Jim,

    When you choose export, a save window pops up and will save it in your default documents folder as New sitemap_sitemap.xml IF you don’t choose where you want it stored. To choose where you want it stored, just click the yellow folder with the green up arrow in Windows (standard icon and navigation) to choose a different folder. To choose a different filename, simply change the name (it is highlighted by default).

  364. Jim Hopkins says:

    Whenever I export my sitemap I cannot find it so I can upload it to my website. I have tried saving it to my desktop and to my website directory on my PC and still cannot locate the file.

  365. Jim says:

    Thanks for the comment Basil!

    I fixed the screen in IE – thanks for pointing that out to me!

    As for Firefox, what happens when you set it to Max Level = 4?

    Best,

    Jim

  366. Basil Halhed says:

    Bug Report XML Sitemap Tool
    ===========================
    I do find that your sitemap tool is the very best — thanks for your effort.

    A couple of bugs in the new version:
    Running in Firefox v3.6.8, Allow *.html, *.php, disallow *.js *.css no images, starting at the root, if I set Max Level = 5, it continues beyond to level six… hundreds of thousands of entries being generated. With php and a large database and many combinations and permutations, max level is important to keep the number of pages indexed to about 50K, the more basic entries.

    Using IE v8.0.6001.18702, (same settings) each tabbed window shows up as perhaps several thousand pixels high. Your app. has its own separate window in IE (whereas in Firefox, it uses the main browser window). In IE, your app’s scroll-thumb can be off the bottom of the IE window — perhaps challenging for the neophyte as some of the option settings aren’t visible. Max Level seems to work under IE.

    (I haven’t tried your app using Chrome or Opera).

    I’m using XP Pro, SP3, Java build 1.6.0_21, memory set to -Xmx1024m, Intel dual core E6750 cpu, 4GB RAM.

    Thanks again for a fine program!

    Basil

  367. Christelle Sneider says:

    Hello,

    I just wanted to thank you for providing google Sitemap Generator.

    And I thank you Jim for your detailed and very helpful answer. I was going to post the same query as Jerry, I faced a similar problem.

  368. Pauline Matthews says:

    Hi, Thank you very much. I was at a loss as to how to generate a sitemap. This has worked in Google so I now have to submit to Yahoo etc. Regards, Pauline

  369. midocu says:

    I tried it and this is what Google responded with: Unsupported file format?

    Hi Midocu,

    You exported a HTML file and not a sitemap file – choose sitemap file, see my comment below.

    Jim

  370. Germaine Shaftic says:

    I often use an internet service to generate an XML sitemap. This is a great tool, thank you very much.
    I am thinking of creating a PHP cron job that can generate the sitemap on the server… Can you help me?

  371. Jim says:

    Hello,

    Quick Answer: No

    Long Answer: Anyone can download your sitemap file if they know the name of it and now people are referring to it in their robots.txt file, which anyone can see. If the sitemap is not in the robots.txt file *and* named something other than sitemap.xml *and* that name is not referenced anywhere on your website then only the search engines you registered the sitemap with would know the name.

    Keep in mind, that passwords are not included in sitemaps, so even if someone did get your sitemap file, they could not figure out your passwords.

  372. stun-gun says:

    Hey I was just curious, if you have a sitemap for spiders to follow where everything is on your site, is there a way that someone could download that file then be able to access a site that had a password to enter it?? Just curious….

  373. Jim says:

    Hello Jerry,

    You exported a HTML file or something other than the sitemap xml file that Google needs. Here is what you do:

    1) Run the sitemap generator on your site.
    2) Click on the Sitemap Tab
    3) Click on the Export Tab
    4) Select Sitemap XML from the drop down list

    Those are the steps needed to create the sitemap xml file that you’ll submit to Google.

  374. Jerry says:

    I tried it and this is what Google responded with:

    Unsupported file format
    Your Sitemap does not appear to be in a supported format. Please ensure it meets our Sitemap guidelines and resubmit.

  375. Jim says:

    Hi Jos,

    I’ve added the regular expressions commands to the main page above, so check those out under include filters. I have some examples that will help you.

    Thanks,

    Jim

  376. nik says:

    Hey, I am trying to generate a sitemap for a website with 50k+ pages, so I want to break it up by subdomains. But I’m not sure how to generate a sitemap for a particular subdomain only.
    Thanks
    Regards
    Nik

  377. Jim says:

    Hi Ted,

    You have to start a new project – if you simply run the sitemap tool again, it won’t change any errors, just display new ones. Just start a new project and you’ll be all set.

    And yes, the ability to find your errors and what has caused them is my favorite (along with titles and times).

    Thanks,

    Jim

  378. Jim says:

    Thanks Ted, I’ve made the change!

  379. Ted says:

    The Yahoo submission link you have as part of the instructions above is no longer available

    I found it at – siteexplorer.search.yahoo.com/submit

    Take care,
    Ted

  380. Ted says:

    First, thank you! The video was very helpful in explaining your very useful tool.

    Being a professional software developer and loving to get feedback from my users, I have two things I wanted to let you know about:

    1) When I went to the “Sitemap” tab, I had to scroll the window down to see the “Buttons” and “Report information” at the bottom. This behavior was the same in Google Chrome v5.0.375.99 and Internet Explorer v8.0.7600.16385 . Both were run in a maximized window.

    2) As in your video, my sitemap generation showed me an error (Wow – I absolutely love that feature). I fixed the problem on the website and wanted to re-run the sitemap generation. I thought “Clear entire sitemap” (Right click on the sitemap results) might get the software back to a start-state, but it did not appear to. It showed the same problem – even though the problem was fixed.

    Thanks again for a great utility!

  381. Jim says:

    Hi Emran,

    You need to prefix your regular expressions with “complex:”.

    Try this command (remove the space):
    complex:h ttp://[yousite]/shopping/men/[A-Za-z0-9/]*$

    If you don’t specify it with complex: then whatever you enter will be treated as a simple expression.
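
    To see what regular-expression matching does with the pattern Emran posted (quoted further down the thread), here is a small Python check. It is only a sketch: it uses Python’s re module and assumes the tool’s regex engine treats these character classes and alternations the same way. The mysite.com host is the placeholder from Emran’s comment, and the third URL is added purely as a counter-example.

```python
import re

# Pattern quoted in the thread (Emran's include filter, without the "complex:" prefix).
PATTERN = re.compile(r"/shopping/men/[a-z]*/([0-9][0-9]|[0-9])/([0-9][0-9]|[0-9])")

urls = [
    "http://mysite.com/shopping/men/casual/21/2",
    "http://mysite.com/shopping/men/sports/1/21",
    "http://mysite.com/shopping/women/casual/21/2",  # counter-example, not from the thread
]

# search() looks for the pattern anywhere in the URL; an engine that requires a
# full-string match would need the scheme and host in the pattern as well.
for url in urls:
    print(url, "->", bool(PATTERN.search(url)))
```

    If the tool matches against the whole URL rather than a substring, that would explain why the pattern in the reply above spells out the http:// prefix and the site name.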

    Let me know how that works for you :)

    Thanks,

    Jim

  382. Emran says:

    Hi Jim…

    Trust all is well, just wondering if a fix for the regular expression issue had been posted?

    Hear from you soon.

    Regards, Emran

  383. Tyler says:

    I tried using your sitemap generator for [snip] (please keep private), but I had a problem similar to others who posted comments earlier. It just says that sessions are respected and I should log out before proceeding, but it gives no way to proceed.

    Thanks.

  384. Emran says:

    Jim – thats fantastic !!!!

  385. Jim says:

    Hi Emran,

    Yes, I’m looking into this and will post a fix – thank you for your descriptive comment, it makes it a lot easier to test!

  386. Emran says:

    Hi

    I’m trying to generate feed for my website, however I only require the product pages for a specific top level category.

    I’m using the include filter option to enter the following regular expression:

    /shopping/men/[a-z]*/([0-9][0-9]|[0-9])/([0-9][0-9]|[0-9])

    this should return the following examples but doesn’t.

    http:// [mysite.com]/shopping/men/casual/21/2
    http:// [mysite.com]/shopping/men/sports/1/21 and so forth

    The regular expression has been checked against an online validator and works fine.

    Any ideas? Please help

  387. Suffi says:

    Jim, does that mean that if I did it well, it is correct? Should I add my XML URL to Google?
    Tnx

  388. Robin says:

    Hi Jim, just to let you know I did have to reinstall Java, works fine now. Thanks for everything!

  389. Jim says:

    Hi Robin,

    You need Sun’s Java to run the tool.

    Visit java.com/en/download/installed.jsp and let me know what version you are running.

    Thanks,

    Jim

  390. Robin says:

    Hi Jim, I don’t have java.exe running, it’s still just giving the sentence on top of the browser, nothing else.

    I tried in IE as well.

  391. Jim says:

    Hi Robin,

    This issue has been fixed and you should not encounter it again.
    This happens to me from time to time and I have no idea why, but my solution is to kill the java.exe process (I use process explorer by Sysinternals (Microsoft) ) and close all my browsers, then try again. There have been times when I needed to do this multiple times before it finally takes.

    This started happening with the new version of Java a while back and I’m still investigating why and how I can fix this.

    Do me a favor will you? Let me know if this works for you and how many times you have to do it.

    Best regards,

    Jim

  392. Robin says:

    Hi Jim, sorry to bother you at this email, but I was going to use your xml sitemap generator. I’ve used it in the past because it seems to index stuff my version of xml sitemaps doesn’t, such as 301 redirects, so I hear.

    The problem is that in both Firefox and IE I’m getting only this on the page:

    This tool respects sessions, so make sure you are LOGGED OUT of your website BEFORE generating a XML SITEMAP!

    A blank page comes up when I click on the icon, only the initial message at the top is showing.

    Any ideas would be appreciated. The applet isn’t showing up.

    Kindest regards

    Robin Downey

  393. Jim says:

    When you say placed well, what exactly are you looking for Suffi?

  394. Suffi says:

    Hi,
    I’m not the webmaster, but I have generated the XML sitemap for my site and I still don’t know if everything is placed well or not. Can anyone help me?
    my site: irtouring. com/textpage.asp?pageID=36
    Thanks

  395. Nikhil says:

    Good Stuff!! Helps me improve more sitemap generation!

  396. Jim says:

    Got it – I’ll work on this later and let you know what I came up with. It probably won’t be until after the 4th…

    Enjoy!

  397. Jos says:

    Currently I have a sitemap containing every single html page, about 4000, which I want to reduce to only the introduction pages, starting a subject in three languages and the index pages of all photo albums, but no longer every html page that contains the enlarged photo.
    Regards, Jos

  398. Jim says:

    I’m sorry Jos, it’s Saturday, have family and friends coming over soon and not into the problem solving mode like I usually am. What exactly do you want to do? Have a sitemap with only index files? I’m trying to get this wrapped up before everyone arrives.

  399. Jos says:

    Hi Jim,

    No, it’s not.

    I entered the url vanderburgt. eu/holidays/citytrips.html
    Then I only use include filter *index* and put a checkmark in the box exclude images. It gives me 17 html pages that lead to the index thumbnail pages mentioned on the page.
    Removing *index* gives me all html pages on the website.

    Regards,

    Jos

  400. Jim says:

    Hi Jos,

    Is the url you are working on the one you mentioned in your previous comment? If so, I’ll try and create a xml sitemap for that – also, send me the excludes / includes you are using.

    Thanks,

    Jim

  401. Jos says:

    Hi Jim,

    If I don’t use any filter all urls on the entire website are found, but as soon as I put *index* in the filter box only “index” pages will be found on the page I entered in the URL address box. No links are followed.

    Regards,

    Jos

  402. Chris says:

    Hi Jim,

    Thanks for looking at this.

    OK so I ran it on your testingiam.com site and it worked perfectly, and reloaded the saved project file with no problems

    I ran it on my site, but added a few more excludes so that I only had 3,000’ish pages, it worked perfectly and reloaded the saved project file with no problem.

    I ran it again on my site and reduced the number of excludes giving me 13000+ pages in the sitemap, and it also worked perfectly and reloaded the project with no errors.

    If you changed something then it has fixed the problem, if not, the problem must have been at my end and I apologise for wasting your time. I have no idea what the problem could have been unfortunately as I don’t believe that I changed anything between the two attempts at indexing the site.
    All I can think of is that I may have had apache and mysql running when the reload failed, so my system may have been a little constrained on memory.

    Thanks again for the tool, and sorry to waste your time.

    Regards Chris

  403. Jos says:

    Hi Jim,

    The .htaccess file was something I wondered about myself in a message to you. I’m afraid it got mixed up on this page containing different posts about various subjects.
    The only error I get is testingiam.com/../index4.html but testingiam.com/index4.html was found.

    Regards, Jos

  404. Jim says:

    Hi Jos,

    Did I mention the .htaccess file? I reread my response to you and didn’t see anything like that…

    And when you ran the sitemap generator on testingiam.com, there should be three errors which are included on purpose, they are index4.html, tesing.html and product1.php

    When you run the sitemap generator on testingiam you should receive the same results.

  405. Jos says:

    the .htaccess file cannot be the problem, because I removed it temporarily while trying to generate a sitemap. Still no results using filter *index*.html

    Thanks – Jos

  406. Jim says:

    Hi Chris,

    I put it to the test, added my filters, ran the sitemap and saved out the project.

    I then selected New Project, then loaded my saved project file (New sitemap.xml) and everything was there.

    How big is your project? The site I gave you testingiam.com, is a small site I set up with valid and invalid urls for testing purposes. If you run the sitemap generator on testingiam.com, do you also receive an error?

    Thanks – Jim

  407. Jim says:

    Hi Bjantiques,

    This can happen from time to time and is often a result of an old version of Sun’s java. If you have the latest, then you can terminate the java.exe process and try again. I should mention, that I’ve had this happen (on occasion) even when the latest version is installed – I simply terminate the java process and try it again. Why this happens I have not a clue, and apologize.

  408. Bjantiques says:

    Addendum info found after further investigation.
    This may help cure the issue at hand.

    I opened page speed and found this error message

    Avoid bad requests
    The following requests are returning 404/410 responses. Either fix the broken links, or remove the references to the non-existent resources.

    * auditmypc. com/jmaster.webtool.app.WebToolApplet.class

  409. Bjantiques says:

    when on this sitemap generator page, I do as instructed and click on the image that has on it Start Now Click on the image.

    This opens a new tab in IE, then I see a java image (I assume it is Flash) displayed and then – nothing more happens. If I try to close out the tab or the browser, nothing happens apart from the usual MS dunk sound that tells you it is not going to react.
    I have to use Task manager to terminate task.

    With FireFox it opens a new tab shows the printed warning about respecting cookie sessions and then yet again – Nothing happens.

    The good part is that I can close the tab, so it’s not locking up my FF browser.

    Please do not ask me what website it is failing on as you did the others with the exact same problem as we have not got that far – before putting in a website you have to get the initial page open.

  410. Jim says:

    Thanks for the reminder Chris, I completely forgot this and will do so today. If you don’t see something here tonight, then bump me – I’m working on converting a big site and it’s taking more of my time than I ever expected!

  411. Chris says:

    Hi Jim,

    RE: java.lang.NullPointerException on project reload

    Did you manage to duplicate the problem?
    I am not hassling for a fix, but if you cannot duplicate the problem then I should look at something locally to see if it’s a problem at my end.

    Thanks

  412. Jos says:

    Hi Jim,

    On the sitemap tab I found 8 html files and 1 “failed” entry on testingiam.com using *index*.html.

    So there must be something else wrong.

    .htaccess perhaps?

    Regards,

    Jos

  413. Jim says:

    Hi Jos,

    I set up a test server you can run the sitemap against. Check out testingiam.com and for the include filter, use:
    *index*.html

    You’ll notice that the files:
    indexme.htm
    index.php
    were not included, along with anything that did not have index in them.
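
    For anyone unsure how a wildcard such as *index*.html behaves, here is a small Python illustration using shell-style matching from the standard fnmatch module. It is only a sketch: it assumes the tool’s simple filters treat * as “any run of characters”, and the exact URLs are made up from the file names mentioned in this thread.

```python
from fnmatch import fnmatchcase

INCLUDE = "*index*.html"  # include filter from the comment above

# URLs built from file names mentioned in this thread (illustration only).
candidates = [
    "http://testingiam.com/index.html",
    "http://testingiam.com/index4.html",
    "http://testingiam.com/indexme.htm",
    "http://testingiam.com/index.php",
]


def included(url, pattern=INCLUDE):
    """Shell-style match: '*' stands for any run of characters, including '/'."""
    return fnmatchcase(url, pattern)


for url in candidates:
    print(url, included(url))
```

    Under this reading, indexme.htm and index.php fail only because they do not end in .html.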

    Once you run the sitemap on this site, let me know the results, then I’ll look at yours – I want to confirm you are getting the same results as me before we dive deep into this.

    Regards,

    Jim

  414. Jos says:

    Hi Jim,

    An example of an index page containing thumbnails is
    vanderburgt.eu/croatia/svetijure/index3.html

    Jos

  415. Jim says:

    Hi Liddia,

    Here is a copy of a robots.txt file with the sitemap included. When the bots, be it Google, Microsoft or any other search engine, visit the site, they look at the robots.txt file to find out what they should include. When they see the sitemap heading, they will read that and may use it to help spider your site.

    User-agent: *
    Disallow:

    Sitemap: http: //www.auditmypc.com/sitemap.xml
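
    To double-check that a robots.txt file advertises the sitemap the way the example above does, a few lines of Python can pull out the Sitemap entries. This is a minimal sketch and not how any particular search engine parses the file; the sample text is the robots.txt above with the space after “http:” removed.

```python
def sitemap_urls(robots_txt):
    """Return the URLs named on 'Sitemap:' lines of a robots.txt body."""
    found = []
    for line in robots_txt.splitlines():
        # Split on the first colon only, so the "http://" in the value survives.
        field, sep, value = line.strip().partition(":")
        if sep and field.strip().lower() == "sitemap":
            found.append(value.strip())
    return found


ROBOTS = """\
User-agent: *
Disallow:

Sitemap: http://www.auditmypc.com/sitemap.xml
"""

print(sitemap_urls(ROBOTS))
```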

  416. Liddia J says:

    How do I include the file in my robots.txt file?

  417. Jim says:

    Hi Jos,

    Can you give me an example of your site and I’ll take a look – by the way, you do not want to use index anything if you can avoid it. I’ll explain more once I see your site.

    Best regards,

    Jim

  418. Jim says:

    Leave a comment if you have any questions Rajesh.

  419. rajesh says:

    I am new to the world of webmasters and came to know about sitemaps on your site. I am going to create one for my site. Thank you for the information.

  420. Jos says:

    Hi Jim,

    Obviously I have a problem understanding wildcards.
    When I use include filter *.html all html pages are included.
    If I want to include only pages like index.html or index2.html or indexwhatever.html I thought *index*.html would be the right definition, but unfortunately nothing happens. What can I do?

    Regards,

    Jos

  421. Peter says:

    Hi
    I set the row filter change frequency and the priority, but they do not transfer to the sitemap; I have to put these in manually. Is there a way to have these set as a default?

    Thanks
    Peter

  422. Jim says:

    Hi Chris,

    I’ll run a sitemap on one of my larger sites and see if I can duplicate the problem. Thanks for taking the time to make me aware and for the kudos :)

    Jim

  423. Jim says:

    Hi Colin,

    Glad to hear it’s all working! and thanks for the mention – may good rankings come your way :)

    Jim

  424. Colin Bryant says:

    thanks Jim,

    it worked this time – I must have used the ‘save as’ option under ‘file’ in the browser rather than the Java ‘Export’ button.

    Excellent tool!

    I’ll do a blog entry for webmaster tools and link to your site for my readers.

    Col

  425. Chris says:

    Hi,

    I would like to report a few problems when reloading a project from disk.

    I always get a java.lang.NullPointerException when reloading and it never replaces the project details page which contains the include and exclude lists.
    I assume this is done after the null pointer error is received.

    The sitemap page appears to be fully reloaded.

    I have tried this in Firefox and IE8

    The project in question contains 10,000+ pages, so it’s not small, but it’s not the largest on the web either.

    I am running:
    Firefox 3.6.3
    IE 8 fully updated
    Vista Ultimate SP2
    1 gig mem
    Java memory set to 384m (it was 256, but I upped it to see if that would help and it didn’t).

    website = danastock.de

    Hope you can help because otherwise this sitemap tool rocks.

    Chris

  426. Jim says:

    Hi Col,

    You exported the wrong file and submitted that to Google as a sitemap (it is a common mistake).

    Once the sitemap generator is finished indexing your site, choose Sitemap, Export then Sitemap XML and give it a name that you’ll remember. Now upload this to your website and tell Google or Microsoft where your XML file is located (that would be your website address/name of your exported file).

    This works like a charm. Also, I just created a sitemap with different priority settings (1, 0.9, 0.8, …) and submitted it to Google and it was verified and read shortly after.

    Hope that helps :)

    Jim

  427. Jim says:

    Hi Mary,

    I’ve thought about that, but using a plugin would not allow you to see your site entirely like the search engines do… My sitemap generator is also a tool for website owners. It allows you to see response times, discover errors such as invalid relative links, html errors (including what caused those errors), titles / mimetypes, encoding, incoming and outgoing links and a host of other information – much of which would be lost if I made this into a plugin.

    If all you need is a basic sitemap for WordPress, then a plugin is the way to go and faster; however, if you want your site error free, are working on optimization and want to see your site the way the search engines do, then my sitemap generator is the tool to use.

    By the way, if you are generating a sitemap for a site built using WordPress, you’ll find these exclusions very handy!

    */trackback/*
    */feed/*
    */feed
    */comments/*
    */tag/*
    */author/*
    */wp-content/*
    *xmlrpc*
    *wp-admin*
    *.jpg
    *.ico
    *.gif
    *.jpeg
    *.css
    *.xml
    *.zip
    *.swf
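
    As a rough illustration of what an exclusion list like this does to a crawl, the Python sketch below filters a handful of URLs against some of the patterns above. It assumes shell-style wildcard matching (Python’s fnmatch), which may differ in detail from the tool’s own matcher, and the example.com URLs are hypothetical.

```python
from fnmatch import fnmatchcase

# A few of the exclusion patterns listed above.
EXCLUDES = ["*/trackback/*", "*/feed/*", "*/feed", "*/tag/*", "*wp-admin*", "*.jpg", "*.css"]


def is_excluded(url, patterns=EXCLUDES):
    """True if the URL matches at least one shell-style exclusion pattern."""
    return any(fnmatchcase(url, p) for p in patterns)


# Hypothetical WordPress URLs, for illustration only.
crawled = [
    "http://www.example.com/2010/06/hello-world/",
    "http://www.example.com/2010/06/hello-world/feed/",
    "http://www.example.com/tag/sitemaps/",
    "http://www.example.com/wp-admin/options.php",
    "http://www.example.com/wp-content/uploads/logo.jpg",
]

kept = [u for u in crawled if not is_excluded(u)]
print(kept)
```

    Only the post URL survives; the feed, tag, admin and image URLs are dropped before they reach the sitemap.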

  428. Mary says:

    I am using WordPress for my website, do you have a sitemap generator plugin?

  429. Col says:

    just seen this on the Google sitemaps help for this error:-

    “That the namespace in the header is “http://www.sitemaps.org/schemas/sitemap/0.9″. Note that this must end in 0.9. If it ends in .9, you’ll see an error.”

    That namespace doesn’t seem to appear at all in my file??
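
    One quick way to check a file for the namespace Google’s help text mentions is to parse it and look at the root element. The Python sketch below does that with the standard library; the function name and the sample document are made up for illustration, and this is not Google’s actual validation.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def looks_like_sitemap(xml_text):
    """True if the root is <urlset> or <sitemapindex> in the sitemap namespace."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False  # not well-formed XML at all
    # ElementTree writes namespaced tags as "{namespace}localname".
    return root.tag in (f"{{{SITEMAP_NS}}}urlset", f"{{{SITEMAP_NS}}}sitemapindex")


GOOD = (
    f'<urlset xmlns="{SITEMAP_NS}">'
    "<url><loc>http://www.example.com/</loc></url>"
    "</urlset>"
)

print(looks_like_sitemap(GOOD))
print(looks_like_sitemap("<html><body>report</body></html>"))
```

    A file exported as an HTML report has an html root (or fails to parse as XML), so a check like this returns False for it.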

  430. Jim says:

    Hi Col,

    What is the name of your sitemap and I’ll take a look and tell you what’s going on.

    If you don’t want the url posted for all to see, just let me know in the comment.

    Best regards,

    Jim

  431. Col says:

    hi,

    saved a XML Sitemap listing about 9000 pages at telecomsadvice.org. uk, put it in the index directory and submitted it to Google – got the same error message as Gerry above – unsupported file format.

    When I opened the xml file to look at it, all the URLs and descriptions are gobbledegook (or is that encoding) unlike other xml generators I have used.

    Any clues as to where I went wrong, please?

  432. Jim says:

    I added a new certificate to the sitemap, so those of you that were receiving an invalid certificate should no longer have this problem.

    The problem was that without a valid certificate, visitors would receive a warning before running the sitemap generator. Removing that warning runs about $300 per year (not fair).

    Certificates cost money, and paying $300 per year was not worth having that message removed but recently, I found a code signing certificate cheap so all should be good for the next three years. Enjoy!

  433. Femi Ojemuyiwa says:

    Doesn’t work in Firefox 3.6.3; got it to work in IE.

  434. tangologix says:

    Nice info provided my friend. going to create my sitemaps following your info. thanks for posting

  435. Sunbizar says:

    sunbizar-technologies. com, how do I add sitemap on this site?

  436. Jim says:

    Hi Betsy,

    I can’t help if I can’t see the website you are trying to build a sitemap for. Leave me a comment and I’ll take a look. If you want the website address to remain hidden, then just say so in the comment and I won’t publish it – everything here is moderated, so it won’t show unless I approve it – too much spam…

  437. Betsy says:

    Jim,
    Unfortunately, I’m having the same problem as Brian.

    When I press the”sitemap generator tool” button, I get a blank page with the heading, “This tool respects sessions, so make sure you are LOGGED OUT of your website BEFORE generating a XML SITEMAP!” Nothing else on the page — no buttons/links/or sitemap tool.

    I do not use log-ins with my website, I have the required java add-on, I use firefox as a browser, I’ve cleared my cache & data. What should I do to get beyond this dead-end stop? I would love to use your tool :)

  438. Jim says:

    Hi Brian,

    There should be no problem with multiple IP addresses. The warning is simply letting you know to be careful if you are logged into a site that has an administration section. Some content management systems, when logged in, will present a delete page or delete post button without confirming your request (sign of poor coding); in cases like this, the sitemap generator may follow the delete link and you would lose your data. So, I issue a warning that it respects sessions.

    So, I’m guessing you are having a problem and associating it with this, but that’s probably not the case. If you leave me a comment with the site(s) you are trying to generate a sitemap for, I can take a look.

    Better yet, can you word your question another way, perhaps I am not following :)

  439. Brian says:

    Hi there, I host my servers from my home and I have 5 static IP addresses, and when I go to create a sitemap I get the error:
    “This tool respects sessions, so make sure you are LOGGED OUT of your website BEFORE generating a XML SITEMAP!” Any way around this? Also, I made sure I was logged out and on a different IP address. Is there any way to solve this?

  440. Jill says:

    Jim, Thanks for the tip, I had no idea!
    :) Jill

  441. Jim says:

    Hi Jill,

    Easy Solution! Simply add this to the include filter:
    */~irlmayo2/*

    Whenever you have a website hosted off of a main site like rootsweb or sites.google.com/site/[Your website] and you want to generate a sitemap, you need to use filters – in this case, I used the include filter.

    This include filter tells the sitemap generator to retrieve anything that starts in the /~irlmayo2/ folder.

    Enjoy!

  442. Jill says:

    I used your sitemap generator once before and it worked quickly and quite well. This time when I put in my website’s URL… it looked like it was trying to index the entire rootsweb genealogy website.
    I put in my site rootsweb.ancestry.com/~irlmayo2/
    Any ideas as to what went wrong??

  443. Gerry says:

    Hi,

    I uploaded my sitemap.xml to my ftp.

    I then submitted the url, bnbliving. com/sitemap.xml, to Google (dashboard/submit sitemap).

    I got the following error:

    Unsupported file format
    Your Sitemap does not appear to be in a supported format. Please ensure it meets our Sitemap guidelines and resubmit.

    Any idea where I might have gone wrong?

    Thanks,

    Gerry

  444. amjad khan says:

    what is this

  445. Stevo Landis says:

    Thanks! This was fast, easy, and only missed one page…

  446. Gary says:

    I try to export the sitemap from the generator to a folder I created on my desktop, but I can’t find the xml in the folder… so how can I get the xml file from the generator?

  447. Gidi Gottlieb says:

    Hi and tx for great tool you share with us!!

    Still, I could not find enough information on cases where the program won’t index a big part of your site, even though the pages are not password protected.
    I deselected the robot, meta etc. options, of course.

    Example: in helena4love. co. il there are about 5000 pages of the type (example) helena4love .co. il/user_page.asp?site_lan=&user_Id=12588
    that the sitemap generator won’t index.

    Any tips will be more than appreciated.

    Gidi

Copyright © 2001-2024 Audit My PC .com All Rights Reserved. Our Privacy Policy and TOS
